NVIDIA is seeking an experienced Senior Deep Learning Engineer to join its team focused on analyzing and improving the performance of LLM inference. As a global leader in GPU technology and AI computing, NVIDIA is at the forefront of the deep learning revolution, enabling breakthroughs in LLMs, generative AI, recommender systems, and computer vision.
In this role, you'll work with cutting-edge LLM frameworks such as TensorRT LLM, vLLM, and SGLang, optimizing state-of-the-art LLMs across NVIDIA's spectrum of accelerators. You'll collaborate with diverse teams in performance modeling, analysis, and kernel development, contributing to the GPU-accelerated deep learning software that powers AI solutions worldwide.
The position requires strong expertise in Python/C/C++ programming and deep learning frameworks, with at least 8 years of relevant software development experience. You'll be responsible for scaling performance across different architectures, optimizing for maximum throughput and minimum latency, and contributing to both NVIDIA and open-source LLM frameworks.
NVIDIA offers a competitive compensation package with a base salary range of $184,000 - $356,500 USD, plus equity and comprehensive benefits. The company is committed to fostering a diverse work environment and is an equal opportunity employer. This is an excellent opportunity to join a leading technology company that's driving innovation in AI and deep learning, working with state-of-the-art technology and contributing to solutions that are transforming industries worldwide.