NVIDIA is seeking a Senior Applied AI Software Engineer to join its Dynamo project, an open-source platform for efficient, scalable inference of large language and reasoning models in distributed GPU environments. This role sits at the intersection of AI infrastructure and distributed systems engineering.
The position involves working on sophisticated challenges in distributed inference, including building Kubernetes deployment systems, developing scalable workload management solutions, and optimizing GPU resource utilization. You'll be responsible for architecting disaggregated serving systems, implementing dynamic GPU scheduling, and enhancing intelligent routing systems for efficient inference request handling.
As a senior engineer, you'll contribute to both the Python SDK and the Rust Runtime Core Library, working with LLM frameworks such as TensorRT-LLM, vLLM, and SGLang. The role requires expertise in distributed systems, parallel computing, and GPU architectures, along with hands-on systems programming experience in Rust, Python, and Go.
NVIDIA offers a highly competitive compensation package, with a base salary ranging from $148,000 to $287,500 USD, plus equity and comprehensive benefits. The company is widely regarded as one of the technology world's most desirable employers, offering opportunities to work on transformative AI technologies that impact a range of industries.
The ideal candidate will have 5+ years of experience, strong technical skills in distributed systems and AI infrastructure, and a track record of contributing to open-source projects. This position offers the opportunity to shape the future of AI inference systems while working with a world-class team at a leading technology company.