
Senior Deep Learning Software Engineer, LLM Performance

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twins technology.
$184,000 - $356,500
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
8+ years of experience
AI

Description For Senior Deep Learning Software Engineer, LLM Performance

NVIDIA is seeking an experienced Senior Deep Learning Engineer focused on analyzing and improving LLM inference performance. As a world leader in accelerated computing and AI, NVIDIA builds GPUs that power breakthroughs in deep learning, particularly in LLMs, generative AI, recommender systems, and vision. The role involves working with cutting-edge LLM frameworks such as TensorRT-LLM, vLLM, and SGLang to optimize performance across NVIDIA's GPU portfolio.

The position requires strong expertise in deep learning software development, with emphasis on performance optimization and scaling of large language models. You'll be collaborating with diverse teams on performance modeling, analysis, and kernel development, while contributing to both NVIDIA and open-source LLM frameworks.

This is an exciting opportunity to work at the forefront of AI computing, helping build the platforms that enable real-time, cost-effective computing solutions. You'll be part of the team that's driving innovation in GPU deep learning, which has become fundamental to machine perception, reasoning, and natural language processing.

The role offers competitive compensation with a base salary range of $184,000 - $356,500 USD, plus equity and benefits. NVIDIA's commitment to diversity and inclusion, combined with its position as "the AI computing company," makes this an excellent opportunity for those passionate about advancing the field of deep learning and AI technology.

Last updated 8 days ago

Responsibilities For Senior Deep Learning Software Engineer, LLM Performance

  • Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment
  • Scale the performance of LLMs across different architectures and types of NVIDIA accelerators
  • Scale performance for maximum throughput, minimum latency, and throughput under latency constraints
  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton
  • Collaborate with teams across generative AI, automotive, image understanding, and speech understanding

Requirements For Senior Deep Learning Software Engineer, LLM Performance

  • Bachelor's, Master's, or PhD degree, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
  • At least 8 years of relevant software development experience
  • Excellent Python/C/C++ programming, software design and software engineering skills
  • Experience with a DL framework such as PyTorch, JAX, or TensorFlow

Benefits For Senior Deep Learning Software Engineer, LLM Performance

  • Equity
