NVIDIA is seeking a Senior Software Engineer to join its Deep Learning Inference team, focusing on building and optimizing TensorRT-LLM, its state-of-the-art inference framework. This role combines AI with high-performance computing, specifically targeting Large Language Models (LLMs) on NVIDIA GPUs, and offers an opportunity to contribute to open-source development.
The role involves developing components for TensorRT-LLM, NVIDIA's premier library for optimizing LLM inference performance, managing the open-source repository, and providing expert technical support to users. The engineer will collaborate with diverse teams, including deep learning experts, GPU architects, and DevOps engineers, both within NVIDIA and in the broader deep learning community.
Ideal candidates should bring 6+ years of software development experience, strong Python skills, and a deep understanding of Machine Learning concepts, particularly in LLMs. Experience with C++, open-source development, and ML frameworks like vLLM, TensorRT, PyTorch, or JAX is highly valued. The position offers competitive compensation ranging from $184,000 to $287,500, plus equity and benefits.
NVIDIA is renowned as one of technology's most desirable employers, offering the chance to work with forward-thinking professionals in a collaborative environment. This role is an excellent opportunity for those passionate about AI and high-performance computing to make significant contributions to the field of deep learning.