
Senior Deep Learning Software Engineer, Inference

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
$148,000 - $287,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI

Description For Senior Deep Learning Software Engineer, Inference

NVIDIA is seeking a Senior Deep Learning Software Engineer specializing in inference optimization to join their growing team. This role sits at the intersection of cutting-edge AI technology and high-performance computing, where you'll be instrumental in developing and optimizing GPU-accelerated software that powers sophisticated AI applications.

The position involves working with LLM inference and serving frameworks such as SGLang and vLLM, which are crucial for efficient large-scale model serving. You'll be responsible for implementing and optimizing state-of-the-art LLM and generative AI models across NVIDIA's range of accelerators, from datacenter GPUs to edge SoCs.

As a senior engineer, you'll collaborate with the deep learning community to implement cutting-edge algorithms and drive performance improvements. The role requires expertise in C/C++ programming, deep learning frameworks, and GPU optimization techniques. You'll work with tools like CUTLASS, OpenAI Triton, NCCL, and CUDA kernels to build and optimize model serving pipelines.

The position offers a competitive salary range of $148,000 to $287,500 USD, along with equity and comprehensive benefits. NVIDIA is known for being one of the technology world's most desirable employers, offering opportunities to work on groundbreaking AI technologies that transform industries.

This role is perfect for someone with 5+ years of relevant experience, strong programming skills, and a deep understanding of AI/ML technologies. You'll be joining a forward-thinking team at NVIDIA's Santa Clara location, where you'll have the opportunity to impact the future of AI acceleration and inference optimization.

The ideal candidate will have a Master's or PhD in a relevant field, experience with deep learning model optimization, and a track record of contributing to significant software projects. Experience with multi-GPU communication and performance optimization would be particularly valuable.


Responsibilities For Senior Deep Learning Software Engineer, Inference

  • Optimize, analyze, and tune the performance of DL models across domains such as LLMs, multimodal, and generative AI
  • Scale performance of DL models across different architectures and types of NVIDIA accelerators
  • Contribute features and code to NVIDIA's inference libraries, as well as vLLM, SGLang, FlashInfer, and LLM software solutions
  • Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and inference optimization solutions

Requirements For Senior Deep Learning Software Engineer, Inference

Python
Linux
  • Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
  • 5+ years of relevant software development experience
  • Excellent C/C++ programming and software design skills
  • Agile software development skills and Python experience
  • Experience with training, deploying or optimizing the inference of DL models in production
  • Background in performance modeling, profiling, debugging, and code optimization
  • GPU programming experience (CUDA, OpenAI Triton, or CUTLASS)

Benefits For Senior Deep Learning Software Engineer, Inference

  • Equity
  • Competitive base salary

