Senior On-Device Model Inference Optimization Engineer

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twins technology.

Santa Clara, CA, USA

$184,000 - $356,500

Machine Learning

Senior Software Engineer

In-Person

5,000+ Employees

10+ years of experience

AI · Automotive

Description For Senior On-Device Model Inference Optimization Engineer

NVIDIA, a pioneer in computer graphics and accelerated computing for over 25 years, is seeking a Senior On-Device Model Inference Optimization Engineer to drive innovation in autonomous vehicle technology. This role combines deep technical expertise in AI model optimization with practical implementation skills for on-device deployment. The position offers an opportunity to work at the intersection of AI and automotive technology, optimizing critical systems that power the future of autonomous vehicles.

The role demands expertise in advanced optimization techniques like pruning, quantization, and knowledge distillation, along with strong programming skills in CUDA, Python, and C++. You'll be working with cutting-edge frameworks including PyTorch, ONNX, and TensorRT, while collaborating with cross-functional teams to align optimization efforts with hardware capabilities.

As an NVIDIAN, you'll join a diverse and supportive environment where innovation is celebrated. The position offers competitive compensation with a base salary range of $184,000 - $356,500 USD, plus equity benefits. This is an excellent opportunity for experienced engineers passionate about pushing the boundaries of AI optimization and autonomous vehicle technology.

The ideal candidate brings 10+ years of relevant experience, with at least 5 years specifically in model inference and optimization. You should have an advanced degree in Computer Science or related field, and a proven track record of deploying optimized AI models at scale. The role requires both technical excellence and strong collaborative skills, as you'll be working across teams to deliver efficient, production-ready solutions for safety-critical systems.


Responsibilities For Senior On-Device Model Inference Optimization Engineer

Develop and implement strategies to optimize AI model inference for on-device deployment
Employ techniques like pruning, quantization, and knowledge distillation
Optimize performance-critical components using CUDA and C++
Collaborate with cross-functional teams
Benchmark inference performance and identify bottlenecks
Research and apply innovative methods for inference optimization
Adapt models for diverse hardware platforms
Create tools to validate accuracy and latency of deployed models
Recommend and implement model architecture changes

Requirements For Senior On-Device Model Inference Optimization Engineer

Python

MSc or PhD in Computer Science, Engineering, or related field
Over 5 years of confirmed experience in model inference and optimization
10+ years of overall work experience
Expertise in PyTorch, ONNX, and TensorRT
Experience in optimizing inference for transformer and convolutional architectures
Strong programming proficiency in CUDA, Python, and C++
In-depth knowledge of optimization techniques
Skilled in building and deploying scalable cloud-based inference systems
Strong collaboration and communication skills
Meticulous attention to detail

Benefits For Senior On-Device Model Inference Optimization Engineer

Equity

