NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior On-Device Model Inference Optimization Engineer to drive innovation in autonomous vehicle technology. This role combines deep technical expertise in AI model optimization with practical implementation skills, focusing on making AI models more efficient and performant for on-device deployment.
The position requires a blend of theoretical knowledge and hands-on experience in machine learning, with a particular focus on model optimization techniques such as pruning, quantization, and knowledge distillation. The ideal candidate will have extensive experience with CUDA, C++, and Python programming, along with expertise in modern ML frameworks and inference toolchains such as PyTorch, ONNX, and TensorRT.
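As a concrete illustration of the kind of optimization work the role involves (not part of the job requirements themselves), the sketch below applies post-training dynamic quantization to a small PyTorch model and exports the float version to ONNX for downstream runtimes. The model, layer sizes, and file name are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

# Hypothetical example model standing in for a small perception sub-network.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).eval()

# Post-training dynamic quantization: weights of Linear layers are stored
# in int8 and dequantized on the fly, shrinking the model and often
# speeding up CPU inference with no retraining required.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Export the float model to ONNX for runtimes such as ONNX Runtime or
# TensorRT. Dynamically quantized modules are generally not ONNX-exportable,
# so quantization for those targets is typically done in the target
# toolchain instead.
dummy_input = torch.randn(1, 512)
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)
```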
Working at NVIDIA means joining a team that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years. The company is now leading the charge in AI and autonomous vehicles, creating technology that powers the next generation of self-driving cars and robotics.
This role offers the opportunity to work on cutting-edge technology while collaborating with cross-functional teams across the organization. The successful candidate will be responsible for developing and implementing optimization strategies, benchmarking performance, and building tools for model validation at scale. They will also play a crucial role in adapting models to a range of hardware platforms and operating systems.
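To make the benchmarking responsibility concrete, here is a minimal latency-measurement sketch in PyTorch. The warm-up count, iteration count, and CUDA synchronization points are illustrative assumptions, not a prescribed methodology for the role.

```python
import time
import torch
import torch.nn as nn

def benchmark(model: nn.Module, example: torch.Tensor,
              warmup: int = 10, iters: int = 100) -> float:
    """Return mean inference latency in milliseconds for a single input."""
    model.eval()
    with torch.no_grad():
        # Warm-up runs let lazy initialization and kernel autotuning settle.
        for _ in range(warmup):
            model(example)
        if example.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(example)
        if example.is_cuda:
            torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters * 1e3

# Hypothetical usage on a small stand-in model.
device = "cuda" if torch.cuda.is_available() else "cpu"
net = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
print(f"{benchmark(net, torch.randn(1, 512, device=device)):.3f} ms/iter")
```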
NVIDIA offers a supportive and diverse environment where innovation is encouraged and individual contributions are valued. The company's legacy of technological advancement and commitment to pushing boundaries make this an exciting opportunity for someone passionate about optimization and AI technology.