NVIDIA is seeking an experienced Senior HPC AI Cluster Engineer to join its E2E software verification HPC/AI Infrastructure team. The role involves hands-on work with supercomputers and HPC clusters, and offers a chance to contribute to the latest developments in artificial intelligence and GPU computing.
As a Senior HPC AI Cluster Engineer, you'll be responsible for designing and implementing large-scale HPC/AI clusters, managing complex workload schedules, and developing automation tools for infrastructure management. You'll work with state-of-the-art accelerated computing and deep learning platforms, collaborating with scientific researchers, developers, and customers to improve their workflows.
The ideal candidate brings 5+ years of experience and deep expertise in HPC environments, covering both the hardware and software sides of high-performance computing. You'll need strong skills in Python programming and Linux systems, along with orchestration and scheduling tools such as Kubernetes and Slurm. Experience with storage solutions, networking protocols, and cloud platforms is essential.
NVIDIA offers the opportunity to work at the forefront of AI and accelerated computing technology. The company provides competitive compensation and benefits and promotes a diverse and inclusive work environment. This remote position allows you to work from various European locations while contributing to projects that are shaping the future of computing technology.
Join NVIDIA to be part of a team that's driving innovation in AI, GPU computing, and high-performance computing, working with the latest technologies and solving complex technical challenges that impact multiple industries.