Senior On-Device Model Inference Optimization Engineer

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twin technology.
$184,000 - $356,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
10+ years of experience
AI · Automotive

Description For Senior On-Device Model Inference Optimization Engineer

NVIDIA, a pioneer in computer graphics and accelerated computing for over 25 years, is seeking a Senior On-Device Model Inference Optimization Engineer to drive innovation in autonomous vehicle technology. This role combines deep technical expertise in AI model optimization with practical implementation skills for on-device deployment. The position offers an opportunity to work at the intersection of AI and automotive technology, optimizing critical systems that power the future of autonomous vehicles.

The role demands expertise in advanced optimization techniques like pruning, quantization, and knowledge distillation, along with strong programming skills in CUDA, Python, and C++. You'll be working with cutting-edge frameworks including PyTorch, ONNX, and TensorRT, while collaborating with cross-functional teams to align optimization efforts with hardware capabilities.

As an NVIDIAN, you'll join a diverse and supportive environment where innovation is celebrated. The position offers competitive compensation with a base salary range of $184,000 - $356,500 USD, plus equity benefits. This is an excellent opportunity for experienced engineers passionate about pushing the boundaries of AI optimization and autonomous vehicle technology.

The ideal candidate brings 10+ years of relevant experience, with at least 5 years specifically in model inference and optimization. You should have an advanced degree in Computer Science or a related field, and a proven track record of deploying optimized AI models at scale. The role requires both technical excellence and strong collaborative skills, as you'll be working across teams to deliver efficient, production-ready solutions for safety-critical systems.

Responsibilities For Senior On-Device Model Inference Optimization Engineer

  • Develop and implement strategies to optimize AI model inference for on-device deployment
  • Employ techniques like pruning, quantization, and knowledge distillation
  • Optimize performance-critical components using CUDA and C++
  • Collaborate with cross-functional teams
  • Benchmark inference performance and identify bottlenecks
  • Research and apply innovative methods for inference optimization
  • Adapt models for diverse hardware platforms
  • Create tools to validate accuracy and latency of deployed models
  • Recommend and implement model architecture changes

Requirements For Senior On-Device Model Inference Optimization Engineer

  • MSc or PhD in Computer Science, Engineering, or a related field
  • Over 5 years of confirmed experience in model inference and optimization
  • 10+ years of overall work experience
  • Expertise in PyTorch, ONNX, and TensorRT
  • Experience in optimizing inference for transformer and convolutional architectures
  • Strong programming proficiency in CUDA, Python, and C++
  • In-depth knowledge of optimization techniques
  • Skilled in building and deploying scalable cloud-based inference systems
  • Strong collaboration and communication skills
  • Meticulous attention to detail

Benefits For Senior On-Device Model Inference Optimization Engineer

  • Equity
