Microsoft's Azure Data engineering team is seeking a Site Reliability Engineer II to join its databases team, focusing on operational database systems. This role is part of Azure Cosmos DB, Microsoft's globally distributed, massively scalable, multi-model cloud database service.
As an SRE II, you'll be responsible for maintaining and improving service reliability for one of Azure's fastest-growing services. The position involves supporting critical systems in healthcare, retail, telecommunications, and IoT, where service availability and latency are paramount. Azure Cosmos DB provides financially backed SLAs of 99.99% availability and latency under 10 ms.
Key responsibilities include:
The role offers competitive compensation (a base salary range of $98,300 to $193,200) and comprehensive benefits, including healthcare, educational resources, and parental leave. This is a remote-friendly position with up to 100% work-from-home flexibility and a travel requirement of 0-25%.
The ideal candidate will bring 4+ years of technical experience in software engineering or systems administration, with specific expertise in SRE practices and cloud services. You'll join a diverse team that values different perspectives and operates with a startup mindset while having the resources and impact of a global technology leader.
This is an excellent opportunity for someone passionate about service reliability, automation, and working with cutting-edge cloud technology at scale. You'll be at the forefront of building and shaping the Livesite Automation and AI Ops stack in Cosmos DB, paving the way for broader adoption across Microsoft Azure.