NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior On-Device Model Inference Optimization Engineer to drive innovation in autonomous vehicle technology. This role combines deep technical expertise in AI model optimization with practical implementation skills, focusing on making AI models more efficient and performant for on-device deployment.
The position requires a blend of theoretical knowledge and hands-on experience in machine learning, with a particular focus on model optimization techniques such as pruning, quantization, and knowledge distillation. The ideal candidate will have extensive experience with CUDA, C++, and Python programming, along with expertise in modern ML frameworks and inference toolchains such as PyTorch, ONNX, and TensorRT.
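As a concrete illustration of the kind of optimization work the role involves (not part of the job requirements themselves), the sketch below applies post-training dynamic quantization to a small PyTorch model and exports the float version to ONNX for downstream runtimes. The model, layer sizes, and file name are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

# Hypothetical example model standing in for a small perception sub-network.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).eval()

# Post-training dynamic quantization: weights of Linear layers are stored
# in int8 and dequantized on the fly, shrinking the model and often
# speeding up CPU inference with no retraining required.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Export the float model to ONNX for runtimes such as ONNX Runtime or
# TensorRT. Dynamically quantized modules are generally not ONNX-exportable,
# so quantization for those targets is typically done in the target
# toolchain instead.
dummy_input = torch.randn(1, 512)
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)
```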
Working at NVIDIA means joining a team that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years. The company is now leading the charge in AI and autonomous vehicles, creating technology that powers the next generation of self-driving cars and robotics.
This role offers the opportunity to work on cutting-edge technology while collaborating with cross-functional teams across the organization. The successful candidate will be responsible for developing and implementing optimization strategies, benchmarking performance, and building tools for model validation at scale. They will also play a crucial role in adapting models to a range of hardware platforms and operating systems.
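To make the benchmarking responsibility concrete, here is a minimal latency-measurement sketch in PyTorch. The warm-up count, iteration count, and CUDA synchronization points are illustrative assumptions, not a prescribed methodology for the role.

```python
import time
import torch
import torch.nn as nn

def benchmark(model: nn.Module, example: torch.Tensor,
              warmup: int = 10, iters: int = 100) -> float:
    """Return mean inference latency in milliseconds for a single input."""
    model.eval()
    with torch.no_grad():
        # Warm-up runs let lazy initialization and kernel autotuning settle.
        for _ in range(warmup):
            model(example)
        if example.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(example)
        if example.is_cuda:
            torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters * 1e3

# Hypothetical usage on a small stand-in model.
device = "cuda" if torch.cuda.is_available() else "cpu"
net = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
print(f"{benchmark(net, torch.randn(1, 512, device=device)):.3f} ms/iter")
```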
NVIDIA offers a supportive and diverse environment where innovation is encouraged and individual contributions are valued. The company's legacy of technological advancement and commitment to pushing boundaries make this an exciting opportunity for someone passionate about optimization and AI technology.