Principal Deep Learning Software Engineer, LLM Performance

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twins technology.
$272,000 - $425,500
Machine Learning
Principal Software Engineer
Hybrid
5,000+ Employees
12+ years of experience
AI

Description For Principal Deep Learning Software Engineer, LLM Performance

NVIDIA is seeking an experienced Principal Deep Learning Engineer to join their team focusing on analyzing and improving the performance of LLM inference. This role is at the forefront of NVIDIA's rapidly growing research and development in Deep Learning Inference. The position involves working with GPU-accelerated Deep Learning software like TensorRT, developing DL benchmarking software, and creating performant solutions for model deployment and serving.

The role requires collaboration with the deep learning community to implement cutting-edge algorithms in TensorRT LLM, vLLM, SGLang, and LLM benchmarks. You'll be responsible for identifying performance opportunities and optimizing state-of-the-art LLMs across NVIDIA's range of accelerators, from datacenter GPUs to edge SoCs. The work involves implementing LLM inference, serving, and deployment algorithms using various frameworks and CUDA kernels.

As a Principal Engineer, you'll work with diverse teams in performance modeling, analysis, kernel development, and inference software development. The position offers the opportunity to contribute to NVIDIA's mission of advancing AI computing, working with the technology that powers breakthroughs in LLM, Generative AI, Recommenders, and Vision applications.

The role comes with competitive compensation, including a base salary range of $272,000 - $425,500 USD, plus equity and benefits. This hybrid position is based in Santa Clara, CA, offering the flexibility of both office and remote work. Join NVIDIA in shaping the future of AI computing and be part of a team that's transforming industries through innovative deep learning solutions.

Responsibilities For Principal Deep Learning Software Engineer, LLM Performance

  • Performance optimization, analysis, and tuning of LLM, VLM, and GenAI models for DL inference, serving, and deployment
  • Scale LLM performance across different architectures and types of NVIDIA accelerators
  • Scale performance for maximum throughput, minimum latency, and throughput under latency constraints
  • Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton
  • Collaborate with teams across generative AI, automotive, image understanding, and speech understanding

Requirements For Principal Deep Learning Software Engineer, LLM Performance

  • Bachelor's, Master's, PhD, or equivalent experience in relevant fields (Computer Engineering, Computer Science, EECS, AI)
  • At least 12 years of relevant software development experience
  • Excellent Python/C/C++ programming, software design and software engineering skills
  • Experience with a DL framework like PyTorch, JAX, TensorFlow

Benefits For Principal Deep Learning Software Engineer, LLM Performance

  • Medical insurance
  • Competitive base salary range $272,000 - $425,500
  • Equity compensation
  • Comprehensive benefits package
  • Hybrid work arrangement

Jobs Related To NVIDIA Principal Deep Learning Software Engineer, LLM Performance

Principal Generative-AI Software Engineer

Principal Generative-AI Software Engineer role at NVIDIA focusing on multimodal learning, video generation, and intelligent simulation.

Principal Prediction and Planning Machine Learning Engineer - Autonomous Vehicles

Lead machine learning role focusing on prediction and planning systems for autonomous vehicles at NVIDIA.

Principal Prediction and Planning Machine Learning Engineer - Autonomous Vehicles

Principal ML Engineering role at NVIDIA focusing on prediction and planning systems for autonomous vehicles.

Principal Machine Learning Engineer - Enterprise AI

Principal Machine Learning Engineer position at NVIDIA focusing on Enterprise AI solutions.