System Software Engineer, Platform Compute

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twin technology.
$168,000 - $322,000
DevOps
Staff Software Engineer
Hybrid
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Job Description

NVIDIA, a global leader in accelerated computing and AI technology, is seeking a System Software Engineer for its Platform Compute team. This role is central to managing and scaling the company's multi-cloud training delivery platform, which spans 3-4 cloud service providers and approximately 50 regions. The position offers an opportunity to work on systems that enable AI learning and development at massive scale.

The role combines DevOps expertise with platform engineering, requiring deep knowledge of cloud infrastructure, containerization, and automation. You'll be responsible for ensuring 24/7 operation of critical training infrastructure while optimizing costs and preventing compute capacity shortages. This is particularly important because the platform faces a potential 10x increase in training demand.

As a core member of the learning systems platform team, you'll work alongside experts and educators to create scalable, reliable learning experiences. The position involves building and maintaining sophisticated cloud infrastructure using technologies like Kubernetes, Terraform, and Python, while working with multiple cloud providers including AWS, Azure, and GCP.

The ideal candidate will bring 8+ years of DevOps experience, strong technical skills in cloud infrastructure, and excellent problem-solving abilities. You'll be working on cutting-edge AI learning platforms, making advanced technologies accessible to learners worldwide. The role offers competitive compensation, including a base salary range of $168,000 - $322,000 (depending on level), equity, and comprehensive benefits.

NVIDIA provides an exceptional work environment, consistently ranked as one of the most desirable employers in the technology sector. This position offers the opportunity to make a significant impact on how people learn and apply AI technologies, while working with some of the industry's most innovative minds in a rapidly growing field.

Last updated 9 hours ago

Responsibilities For System Software Engineer, Platform Compute

  • Building systems to support maintenance, scaling, and operation of diverse global compute platforms across multiple cloud providers
  • Driving continuous cost optimization for compute resources
  • Designing and implementing flexible solutions for compute capacity and resource availability
  • Building, maintaining, and optimizing orchestration functions
  • Managing and maintaining artifacts for consistent baseline compute capability

Requirements For System Software Engineer, Platform Compute

Python · Kubernetes · Linux
  • Bachelor's degree in Computer Science, related technical field, or equivalent experience
  • 8+ years of DevOps experience with containerized applications
  • Experience in building scalable, reliable services and distributed system integration
  • Hands-on experience maintaining AWS security groups, roles, and IAM
  • Proficiency in Python and Linux shell scripting
  • Experience with Terraform for cloud infrastructure
  • Strong problem-solving and analytical skills
  • Excellent communication and teamwork skills

Benefits For System Software Engineer, Platform Compute

  • Competitive salaries
  • Equity
  • Comprehensive benefits package

Related Jobs

Senior System Software Engineering Lead for Release - Base OS

Senior System Software Engineering Lead position at NVIDIA, focusing on Base OS development and release management, offering $184K-$287.5K and hybrid work options.

Senior Network Automation Architect - DGX Cloud

Senior Network Automation Architect role at NVIDIA focusing on Kubernetes cluster automation and network infrastructure for DGX Cloud, offering competitive compensation and the opportunity to work with cutting-edge technology.

Lead System Software Engineer, CPU and GPU Performance Visualization Tools

Lead System Software Engineer role at NVIDIA focusing on CPU and GPU performance visualization tools development, requiring 12+ years of experience.

Diagnostics Software Infrastructure Engineer

Lead DevOps role at NVIDIA focusing on GPU software development lifecycle, build infrastructure, and cross-team coordination for diagnostics software teams.

Member of Technical Staff - Data Infrastructure Engineer (DevOps|SRE|Platform Engineering|MLOps)

Microsoft seeks a Staff Data Infrastructure Engineer to build and maintain scalable AI systems, combining DevOps, SRE, and MLOps practices. NYC-based role offering $158K-$258K.