NVIDIA is seeking an experienced Senior DGX Cloud Software Engineer to join its Infrastructure Automation and Distributed Systems team. This role is central to supporting NVIDIA's AI training and inference development initiatives through the DGX Cloud platform. The position offers an opportunity to work with cutting-edge AI and cloud infrastructure technology at one of the industry's most innovative companies.
The role involves building and maintaining large-scale private and public cloud systems, with a focus on bare-metal, accelerated compute infrastructure. You'll work with advanced technologies including BlueField networking, InfiniBand topologies, and the NVIDIA Collective Communications Library (NCCL). The position requires strong expertise in cloud infrastructure, distributed systems, and automation at scale.
As a senior engineer, you'll be responsible for designing and implementing cloud infrastructure services, helping define service level objectives, and maintaining high reliability standards. The role includes on-call responsibilities and requires a collaborative approach to problem-solving and system design.
NVIDIA offers competitive compensation with a base salary range of $144,000 - $270,250 USD (depending on level), plus equity and comprehensive benefits. The company is known for its innovative culture and commitment to pushing technological boundaries in AI, High-Performance Computing, and Visualization.
The ideal candidate will have 5+ years of relevant experience, strong programming skills in Python or Go, and deep knowledge of cloud technologies like Kubernetes and Linux. Experience with ML/AI systems is a plus but not required. This role offers the opportunity to work on challenging problems at scale while contributing to NVIDIA's mission of accelerating the next wave of artificial intelligence.