Principal Deep Learning Software Engineer, LLM Performance

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twins technology.
$272,000 - $425,500
Machine Learning
Principal Software Engineer
Hybrid
5,000+ Employees
12+ years of experience
AI

Description For Principal Deep Learning Software Engineer, LLM Performance

NVIDIA is seeking an experienced Principal Deep Learning Engineer to join its team, focusing on analyzing and improving the performance of LLM inference. This role sits at the intersection of deep learning and high-performance computing, working with cutting-edge LLM technologies and NVIDIA's GPU platforms.

The position involves working with state-of-the-art large language models and implementing performance optimizations across NVIDIA's range of accelerators, from datacenter GPUs to edge SoCs. You'll collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang, and LLM benchmarks.

As a Principal Engineer, you'll be responsible for scaling performance across different architectures, optimizing for maximum throughput and minimum latency, and contributing to both NVIDIA and open-source LLM frameworks. The role requires deep expertise in Python/C/C++ programming and experience with deep learning frameworks like PyTorch, JAX, or TensorFlow.

NVIDIA's position as the "AI computing company" makes this an exciting opportunity to work on technology that's transforming industries. You'll be part of the team that enables the performance optimization, deployment, and serving of deep learning solutions used by companies worldwide.

The role offers competitive compensation with a base salary range of $272,000 - $425,500 USD, plus equity and benefits. Working in a hybrid environment, you'll collaborate with diverse teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions.

This is an opportunity to work at the forefront of AI technology, specifically in the rapidly growing field of large language models, while contributing to software that powers breakthroughs in areas like Generative AI, Recommenders, and Computer Vision.

Responsibilities For Principal Deep Learning Software Engineer, LLM Performance

  • Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment
  • Scale LLM performance across different architectures and types of NVIDIA accelerators
  • Tune performance for maximum throughput, minimum latency, and throughput under latency constraints
  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton
  • Collaborate with teams across generative AI, automotive, image understanding, and speech understanding

Requirements For Principal Deep Learning Software Engineer, LLM Performance

Python
Linux
  • Bachelor's, Master's, PhD, or equivalent experience in a relevant field (Computer Engineering, Computer Science, EECS, AI)
  • At least 12 years of relevant software development experience
  • Excellent Python/C/C++ programming, software design and software engineering skills
  • Experience with a DL framework such as PyTorch, JAX, or TensorFlow

Benefits For Principal Deep Learning Software Engineer, LLM Performance

  • Equity

Jobs Related To NVIDIA Principal Deep Learning Software Engineer, LLM Performance

Principal Generative-AI Software Engineer

Principal Generative-AI Software Engineer position at NVIDIA, focusing on developing cutting-edge AI models and systems with competitive compensation and opportunity to work on breakthrough technologies.

Principal Prediction and Planning Machine Learning Engineer - Autonomous Vehicles

Lead machine learning role focusing on prediction and planning systems for autonomous vehicles at NVIDIA.

Principal Machine Learning Engineer - Enterprise AI

Principal Machine Learning Engineer position at NVIDIA focusing on Enterprise AI solutions.

Principal Deep Learning Software Engineer, LLM Performance

Principal Deep Learning Software Engineer position focused on LLM Performance optimization at NVIDIA's Santa Clara location.