NVIDIA, a pioneer in accelerated computing and GPU technology, is seeking a Senior AI-HPC Cluster Engineer to join their MLOps team. This role sits at the intersection of high-performance computing and artificial intelligence, focusing on building and optimizing large-scale GPU compute clusters for deep learning and HPC workloads.
The position offers an opportunity to work with cutting-edge technology at a company that has been transforming computer graphics and accelerated computing for over 25 years. As an NVIDIAN, you'll be responsible for designing and implementing GPU compute clusters, developing scalable automation solutions, and supporting researchers in optimizing their workloads.
The ideal candidate brings 6+ years of experience in large-scale compute infrastructure, strong programming skills in both scripting (Python, Bash) and compiled languages (Go, Rust, C++), and deep expertise in Linux systems and container technologies. Knowledge of AI/HPC job schedulers, performance tuning, and distributed systems is essential.
This role offers competitive compensation with a base salary ranging from $184,000 to $356,500 USD (depending on level), plus equity and comprehensive benefits. You'll be joining a diverse, supportive environment where innovation is celebrated and you can make a lasting impact on the world of AI and high-performance computing.
The position offers flexibility with locations in major tech hubs like Santa Clara, CA and Austin, TX, with hybrid work options available. You'll be working with state-of-the-art technology including NVIDIA GPUs, CUDA programming, and modern ML frameworks, while collaborating with talented researchers and engineers to push the boundaries of what's possible in AI and HPC.