Microsoft's Azure Data engineering team is seeking a Principal Site Reliability Engineer (SRE) to join its mission of building the data platform for the age of AI. The role sits on the databases team, which builds and maintains Microsoft's operational database systems.
As a Principal SRE, you'll take a data-driven approach to solving service reliability problems. You'll analyze massive amounts of telemetry and service health indicators in near real time, perform automated root cause analysis, and implement the mitigations needed to restore SLOs. The role involves close collaboration with engineering teams to improve tooling and automation for faster issue resolution.
Key responsibilities include:
The position offers competitive compensation with a base pay range of $139,900 - $274,800 (higher in the San Francisco Bay Area and New York City). Microsoft provides comprehensive benefits including healthcare, educational resources, savings plans, parental leave, and more.
This is an excellent opportunity for an experienced SRE to make a significant impact on Microsoft's critical database infrastructure while working with cutting-edge cloud technologies and AI-enabled systems. The role offers a blend of technical challenge, customer interaction, and strategic influence on product development.