Google's Site Reliability Engineering (SRE) team combines software and systems engineering to build and maintain large-scale, distributed, fault-tolerant systems. As an SRE II, you'll be responsible for ensuring that Google Cloud's services maintain optimal reliability and uptime while managing capacity and performance. The role involves solving complex problems at a scale unique to Google Cloud's infrastructure.
The position requires strong coding skills, an understanding of algorithms, and system design capabilities. You'll optimize existing systems, build infrastructure, and create automation solutions. The team values diversity, intellectual curiosity, and collaborative problem-solving in a blame-free environment.
Google offers a supportive environment for learning and growth, encouraging self-direction while providing necessary mentorship. The role presents an opportunity to work on meaningful projects that directly impact Google's critical infrastructure and customer-facing services.
The ideal candidate has experience with distributed systems, a strong foundation in computer science, and a passion for system reliability and automation. You'll be part of a team that manages some of the world's largest computing systems while contributing to Google's engineering excellence.
Working at Google means joining a company committed to diversity, equal opportunity, and creating a culture of belonging. The role offers the chance to work with cutting-edge technology while collaborating with some of the industry's brightest minds in system reliability and infrastructure engineering.