Senior Applied AI Software Engineer, Distributed Inference Systems

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
$148,000 - $287,500
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior Applied AI Software Engineer, Distributed Inference Systems

NVIDIA is seeking a Senior Applied AI Software Engineer to join its Dynamo project, an open-source platform for efficient, scalable inference of large language and reasoning models in distributed GPU environments. The role sits at the intersection of AI infrastructure and distributed systems engineering.

The position involves working on sophisticated challenges in distributed inference, including building Kubernetes deployment systems, developing scalable workload management solutions, and optimizing GPU resource utilization. You'll be responsible for architecting disaggregated serving systems, implementing dynamic GPU scheduling, and enhancing intelligent routing systems for efficient inference request handling.

As a senior engineer, you'll contribute to both the Python SDK and the Rust Runtime Core Library, working with LLM frameworks such as TensorRT-LLM, vLLM, and SGLang. The role requires expertise in distributed systems, parallel computing, and GPU architectures, with hands-on experience in systems programming (Rust and/or C++), Python, and Go.

NVIDIA offers a competitive compensation package, with a base salary ranging from $148,000 to $287,500 USD, plus equity and comprehensive benefits. The company is known as one of the technology world's most desirable employers, offering opportunities to work on transformative AI technologies.

The ideal candidate will have 5+ years of experience, strong technical skills in distributed systems and AI infrastructure, and familiarity with open-source development workflows. This position offers the opportunity to shape the future of AI inference systems while working with a world-class team at a leading technology company.

Last updated 6 days ago

Responsibilities For Senior Applied AI Software Engineer, Distributed Inference Systems

  • Collaborate on the design and development of the Dynamo Kubernetes stack
  • Introduce new features to the Dynamo Python SDK and Dynamo Rust Runtime Core Library
  • Design, implement, and optimize distributed inference components in Rust and Python
  • Contribute to disaggregated serving development
  • Improve intelligent routing and KV-cache management subsystems
  • Contribute to open-source repositories and participate in code reviews
  • Work with the community to address issues and evolve the framework
  • Write documentation and contribute to user/developer guides

Requirements For Senior Applied AI Software Engineer, Distributed Inference Systems

Python · Rust · Kubernetes
  • BS/MS or higher in computer engineering, computer science, or a related engineering field
  • 5+ years of proven experience in a related field
  • Strong proficiency in systems programming (Rust and/or C++), Python, and Go
  • Deep understanding of distributed systems, parallel computing, and GPU architectures
  • Experience with cloud-native deployment and container orchestration
  • Experience with large-scale inference serving, LLMs, or similar AI workloads
  • Background in memory management, data transfer optimization, and multi-node orchestration
  • Familiarity with open-source development workflows
  • Excellent problem-solving and communication skills

Benefits For Senior Applied AI Software Engineer, Distributed Inference Systems

Medical Insurance · Equity
  • Competitive salaries
  • Comprehensive benefits package
  • Equity