We are seeking a highly skilled and experienced Large GPU Cluster Performance and Benchmark Engineer to join our advanced technology team as a Senior Principal. This role focuses on designing, optimizing, and benchmarking large-scale GPU clusters, including running MLPerf benchmarks from MLCommons across thousands of NVIDIA and AMD GPUs. The position offers an opportunity to be at the forefront of GPU performance benchmarking and large-scale infrastructure design, working with cutting-edge technology and a highly skilled team.
The role involves leading performance optimization for AI/ML workloads, conducting comprehensive STAC benchmarks, and architecting solutions for high-performance computing environments. You'll collaborate with cross-functional teams to develop innovative GPU cluster architectures while serving as a technical thought leader in the field.
As part of Oracle, a world leader in cloud solutions, you'll have access to extensive resources and opportunities to work on challenging projects that push the boundaries of technology. The position offers competitive compensation ($96,800 to $251,600) and comprehensive benefits, including medical, dental, and vision insurance; a 401(k) with company match; flexible vacation; and parental leave.
The ideal candidate will bring 10+ years of experience in GPU cluster architecture and benchmarking, strong programming skills, and expertise in container orchestration and cloud infrastructure. You'll need to demonstrate exceptional analytical abilities and strong communication skills to succeed in this collaborative environment.
This role represents an excellent opportunity for an experienced professional looking to make significant contributions to the advancement of GPU cluster performance and AI/ML infrastructure at a global technology leader. Join us in shaping the future of cloud computing and artificial intelligence technologies.