Microsoft Azure AI/HPC team is seeking a Principal Supercomputing Software Engineer to enable customers in deploying, monitoring, profiling, and debugging applications on hyperscale cloud infrastructure. This role is critical in building and maintaining Azure's largest supercomputing deployments, which have achieved recognition in Top500, MLPerf, and Graph500 rankings.
As a Principal Supercomputing Engineer, you'll be responsible for developing state-of-the-art tools and techniques for managing cloud-native supercomputers at scale. You'll work on maintaining system reliability, runtime performance, and health monitoring while meeting customer SLAs. The role involves establishing best practices, driving architectural changes, and influencing the roadmap of software and hardware components.
The position offers a competitive base salary range of $137,600 - $267,000 (higher in SF Bay Area and NYC), along with comprehensive benefits including healthcare, educational resources, and investment options. This is a remote-friendly role with 0-25% travel requirements.
The ideal candidate will bring deep expertise in HPC systems, cloud infrastructure, and software engineering. You'll be part of Microsoft's mission to empower every person and organization globally, working in a culture that values growth mindset, innovation, and collaboration.
This is an exceptional opportunity for a seasoned professional to impact the future of AI and HPC in the cloud, working with cutting-edge technology at unprecedented scale. Your work will directly influence a wide range of users and drive the next wave of innovation in cloud supercomputing.