Principal Deep Learning Software Engineer, LLM Performance

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twins technology.

Santa Clara, CA, USA

$272,000 - $425,500

Machine Learning

Principal Software Engineer

Hybrid

5,000+ Employees

12+ years of experience

Description For Principal Deep Learning Software Engineer, LLM Performance

NVIDIA is seeking an experienced Principal Deep Learning Software Engineer to join its team focused on analyzing and improving the performance of LLM inference. This role is at the forefront of NVIDIA's rapidly growing research and development in deep learning inference. The position involves working with GPU-accelerated deep learning software such as TensorRT, developing DL benchmarking software, and creating performant solutions for model deployment and serving.

The role requires collaboration with the deep learning community to implement cutting-edge algorithms in TensorRT LLM, vLLM, SGLang, and LLM benchmarks. You'll be responsible for identifying performance opportunities and optimizing state-of-the-art LLMs across NVIDIA's range of accelerators, from datacenter GPUs to edge SoCs. The work involves implementing LLM inference, serving, and deployment algorithms using various frameworks and CUDA kernels.

As a Principal Engineer, you'll work with diverse teams in performance modeling, analysis, kernel development, and inference software development. The position offers the opportunity to contribute to NVIDIA's mission of advancing AI computing, working with the technology that powers breakthroughs in LLM, Generative AI, Recommenders, and Vision applications.

The role comes with competitive compensation, including a base salary range of $272,000 - $425,500 USD, plus equity and benefits. This hybrid position is based in Santa Clara, CA, offering the flexibility of both office and remote work. Join NVIDIA in shaping the future of AI computing and be part of a team that's transforming industries through innovative deep learning solutions.

Responsibilities For Principal Deep Learning Software Engineer, LLM Performance

Optimize, analyze, and tune the performance of LLM, VLM, and GenAI models for DL inference, serving, and deployment
Scale LLM performance across different architectures and types of NVIDIA accelerators
Tune performance for maximum throughput, minimum latency, and throughput under latency constraints
Contribute features and code to NVIDIA and open-source LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton
Collaborate with teams across generative AI, automotive, image understanding, and speech understanding

Requirements For Principal Deep Learning Software Engineer, LLM Performance

Python

Bachelor's, Master's, or PhD degree, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
At least 12 years of relevant software development experience
Excellent Python/C/C++ programming, software design and software engineering skills
Experience with a DL framework such as PyTorch, JAX, or TensorFlow

Benefits For Principal Deep Learning Software Engineer, LLM Performance

Medical Insurance

Equity

Competitive base salary range $272,000 - $425,500
Equity compensation
Comprehensive benefits package
Hybrid work arrangement
