Microsoft Azure AI/HPC team is seeking a Principal Supercomputing Software Engineer to enable customers in deploying, monitoring, profiling, and debugging applications on hyperscale cloud infrastructure. This role is critical in building and maintaining cloud-native supercomputers, requiring specialized tools and techniques to ensure system reliability and performance.
The position involves working with state-of-the-art tools, establishing best practices, driving architectural changes, and influencing the roadmap of software and hardware components. You'll be directly impacting Azure's supercomputing capabilities, which have already made marks on Top500, MLPerf, and Graph500 rankings.
As a Principal Engineer, you'll be responsible for maintaining system reliability, runtime performance, and health while meeting customer SLAs. The role requires deep expertise in HPC systems, cloud infrastructure, and software development. You'll work closely with customers, vendors, and internal teams to drive comprehensive solutions for operating world-class supercomputers in the public cloud.
This is an excellent opportunity for someone passionate about large-scale computing systems and cloud technology. The position offers competitive compensation ($137,600 - $267,000), comprehensive benefits, and the chance to work on cutting-edge technology that powers some of the world's largest supercomputing deployments. Working at Microsoft, you'll be part of a culture that emphasizes growth mindset, innovation, and collaboration to achieve shared goals.