Google's Site Reliability Engineering (SRE) team is seeking a talented Software Engineer to join its Cloud Platform Reliability team. This role combines software and systems engineering to build and maintain the large-scale distributed systems that power Google Cloud's services. As an SRE, you'll be responsible for ensuring the reliability and uptime of both internal and customer-facing systems while managing the challenges of scale unique to Google Cloud.
The position offers an opportunity to work with complex distributed systems, focusing on optimization, infrastructure development, and automation. You'll be part of a culture that values intellectual curiosity, problem-solving, and openness, working alongside teammates with diverse backgrounds and perspectives.
The role involves hands-on coding, system design, and operational responsibilities. You'll help maintain and improve system reliability, plan capacity, and optimize performance. The team promotes self-direction while providing support and mentorship for professional growth.
Key aspects of the role include code development, peer review, documentation, system troubleshooting, and participating in technical design decisions. The ideal candidate will combine strong software development skills with an interest in large-scale system operations and reliability engineering.
Google offers a collaborative environment where you can make a significant impact on critical infrastructure while working with cutting-edge technology. The position provides opportunities to build technical depth and to grow your career within the SRE organization.