NVIDIA, the world leader in accelerated computing, is seeking a Senior Site Reliability Engineer to join its Infrastructure, Planning and Processes organization. This role is part of a dynamic team that develops and maintains sophisticated build and test environments for a range of hardware platforms, including NVIDIA GPUs and Tegra processors, across multiple operating systems. The position involves working with diverse business units, including Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence, Robotics, and Driverless Cars.
The ideal candidate will be responsible for implementing and managing Kubernetes architectures, developing automation tools, and ensuring high availability of systems. They will work with cutting-edge technologies in cloud infrastructure, containerization, and DevOps practices. The role requires strong expertise in programming (Python/Go), infrastructure as code, and modern monitoring solutions.
This is an excellent opportunity for an experienced SRE professional to work with state-of-the-art technology at a company that's driving innovation in AI, gaming, and autonomous vehicles. The position offers competitive compensation and benefits, along with the chance to work alongside some of the most forward-thinking professionals in the technology industry. The role combines technical depth with the opportunity to impact critical infrastructure supporting NVIDIA's groundbreaking work in accelerated computing.