Principal DGX Cloud AI Infrastructure Software Engineer

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.
Cloud
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
AI · Enterprise SaaS · Cloud

Job Description

NVIDIA, a pioneer in accelerated computing for over 25 years, is seeking a Principal DGX Cloud AI Infrastructure Software Engineer to join its team. This role sits at the intersection of cloud computing and AI infrastructure, focusing on the DGX Cloud Lepton Marketplace platform. The position offers an opportunity to work with cutting-edge technology in GPU-optimized virtual machines and cloud infrastructure.

As a Principal Engineer, you'll be responsible for establishing crucial integrations with NVIDIA Cloud Partners, enabling global developers to access GPU-optimized virtual machines seamlessly. The role involves crafting sophisticated IaaS API integrations, developing stateful workflow orchestration, and ensuring high-quality, fault-tolerant solutions across diverse cloud environments.

NVIDIA's environment is perfect for those passionate about cloud infrastructure, Kubernetes, distributed systems, and API development. The company's culture emphasizes innovation, technical excellence, and collaborative problem-solving. You'll be part of a team that's defining the next era of computing, where GPUs power the brains of computers, robots, and self-driving cars.

The ideal candidate brings 15+ years of experience in large-scale AI systems development, deep expertise in Kubernetes and cloud infrastructure, and strong programming skills, particularly in Go. This role offers the chance to make a lasting impact on the world of AI and cloud computing while working for a global leader in accelerated computing technology.

Last updated 2 days ago

Responsibilities For Principal DGX Cloud AI Infrastructure Software Engineer

  • Work with the DGX Cloud Lepton Marketplace team to establish integrations with NVIDIA Cloud Partners
  • Craft and implement IaaS API integrations
  • Collaborate with external engineering teams
  • Shape integration strategies
  • Develop stateful workflow orchestration
  • Drive improvements in testing, observability, and automation
  • Develop a two-sided marketplace, including integration of compute providers
  • Craft discovery and bidding experiences to match supply with demand

Requirements For Principal DGX Cloud AI Infrastructure Software Engineer

Kubernetes
Go
  • 15+ years of experience in developing software infrastructure for large-scale AI systems
  • Expertise in software engineering with Kubernetes
  • Familiarity with cloud infrastructure environments (VMaaS, VPCs, RDMA, shared file systems)
  • Proven ability to handle third-party API integrations
  • Comfort in a fast-paced environment
  • Strong technical knowledge including proficiency in systems programming (preference for Go)
  • BS in Computer Science, Engineering, Physics, Mathematics, or equivalent experience