Staff MLSys Engineer - Kernel Optimization

OctoAI

OctoAI is a leading startup in the fast-paced generative AI market, delivering generative AI infrastructure to run, tune, and scale models that power AI applications.

$200,000 - $240,000

Backend

Staff Software Engineer

Remote

51 - 100 Employees

8+ years of experience

This job posting may no longer be active.

Description For Staff MLSys Engineer - Kernel Optimization

OctoAI is a leading startup in the generative AI market, focused on empowering businesses to build differentiated applications with the latest AI features. Our platform, OctoAI, provides efficient AI infrastructure for running, tuning, and scaling models that power AI applications. We offer the fastest foundation models, integrated customization solutions, and world-class ML systems.

As a Staff MLSys Engineer specializing in Kernel Optimization, you'll join our Automation team to develop the most efficient engine for generative model deployment. Your focus will be on enhancing GPU performance through detailed kernel adjustments and broader system-level optimizations, including continuous batching.

Key responsibilities include:

Developing and optimizing high-performance computing kernels for GPU acceleration
Implementing and enhancing programming solutions in C/C++ and Python
Deep diving into GPU performance optimizations
Working on kernel optimizations for CUDA or other accelerators
Collaborating on machine learning compilers or frameworks (optional)

We're looking for candidates with:

Advanced degree in Computer Science, Electrical Engineering, or related field
Strong programming skills in C/C++ and Python
Deep understanding of GPU performance optimizations
Extensive experience with kernel optimizations on CUDA or other accelerators
Experience contributing to innovative projects such as Cutlass, FlashAttention, FlashInfer, mlc-llm, and vllm

At OctoAI, we value diversity, offer competitive compensation, and provide comprehensive benefits. Join us in shaping the future of AI infrastructure!

Last updated 9 months ago

Responsibilities For Staff MLSys Engineer - Kernel Optimization

Develop and optimize high-performance computing kernels with a focus on GPU acceleration
Implement and enhance programming solutions in C/C++ and Python
Deep dive into GPU performance optimizations to maximize efficiency and speed
Work on kernel optimizations specifically for CUDA or other accelerators
Collaborate with the team to extend and improve existing machine learning compilers or frameworks (optional)

Requirements For Staff MLSys Engineer - Kernel Optimization

Bachelor's, Master's, or PhD degree in Computer Science, Electrical Engineering, or a related field
Strong programming skills in C/C++ and Python
Deep understanding and experience in GPU performance optimizations
Extensive experience with kernel optimizations on CUDA or other accelerators
Extensive experience contributing to innovative open-source or closed-source projects such as Cutlass, FlashAttention, FlashInfer, mlc-llm, and vllm

Benefits For Staff MLSys Engineer - Kernel Optimization

Fully covered healthcare premiums for employees and dependents (Medical, Dental, Vision, Life Insurance, Disability Insurance)
Competitive compensation including salary, bonuses, and stock options
Flexible Spending Accounts and Health Savings Account
401(k) options
Flexible work options and hours
Generous time off policies
Company-sanctioned downtime twice a year
Company-paid holidays
Comprehensive parental leave
Volunteer Time Off (4 days a year)
Additional leaves including disability, paid family medical leave, and paid military leave
