OctoAI is a leading startup in the generative AI market, focused on empowering businesses to build differentiated applications with the latest AI features. Our platform provides efficient AI infrastructure for running, tuning, and scaling the models that power AI applications. We offer the fastest foundation models, integrated customization solutions, and world-class ML systems.
As a Staff MLSys Engineer specializing in Kernel Optimization, you'll join our Automation team to develop the most efficient engine for generative model deployment. Your focus will be on enhancing GPU performance through detailed kernel adjustments and broader system-level optimizations, including continuous batching.
Key responsibilities include:
We're looking for candidates with:
At OctoAI, we value diversity, offer competitive compensation, and provide comprehensive benefits. Join us in shaping the future of AI infrastructure!