NVIDIA is seeking a Senior AI Infrastructure Engineer to join its DGX Cloud SRE team, which keeps GPU cloud services reliable for both internal and external users. The role focuses on designing and maintaining large-scale production systems. It combines software and systems engineering practices and requires expertise in systems, networking, coding, database management, and cloud technologies.
The role demands a strong background in infrastructure automation and distributed systems design, along with experience in modern cloud technologies such as Kubernetes and OpenStack. The ideal candidate will have 5+ years of experience and a BS in Computer Science or a related field, with expertise in languages such as Python, Go, C/C++, or Java. Knowledge of Linux, networking, and container technologies is essential.
NVIDIA offers a competitive compensation package with a base salary range of $148,000 - $287,500 USD, plus equity and benefits. The company is known for its innovative work in AI, High-Performance Computing, and Visualization, and it counts the GPU as its groundbreaking invention. It promotes a culture of diversity, intellectual curiosity, and problem-solving, and it encourages collaboration and risk-taking in a blame-free environment.
This position offers the opportunity to work on meaningful projects while receiving support and mentorship for professional growth. The team ensures maximum reliability and uptime of GPU cloud services while managing system changes, capacity, and performance. The work environment is dynamic and forward-thinking, well suited to creative and autonomous professionals who are passionate about advancing technology.