Google Cloud's Site Reliability Engineering (SRE) team is seeking a Senior Software Engineer to join its mission of building and maintaining large-scale distributed systems. This role combines software and systems engineering to keep Google Cloud's services reliable and performant. As an SRE, you'll tackle complex scalability challenges unique to Google Cloud while leveraging your expertise in coding, algorithms, and system design.
The position offers an opportunity to work with Google's Technical Infrastructure team, which forms the backbone of Google's entire product portfolio. You'll be responsible for the full lifecycle of services, from initial design through deployment and ongoing maintenance. The role involves system design consulting, developing software platforms, capacity planning, and launch reviews.
The ideal candidate will have strong experience in distributed systems, demonstrable software development skills, and the ability to lead technical projects. You'll work in a blame-free culture that values diversity, intellectual curiosity, and problem-solving, collaborating with professionals from various backgrounds.
Key aspects of the role include monitoring system health, implementing automation for scalability, and participating in incident response. You'll be part of a team that's essential to keeping Google's networks running optimally, ensuring users have the best possible experience. This is an excellent opportunity for engineers who enjoy solving complex technical challenges while having a direct impact on Google Cloud's infrastructure reliability.