
Staff Software Engineer, Colossus Site Reliability Engineering

Google is a global technology company that builds innovative products and services used by billions of users worldwide.
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS
This job posting may no longer be active.

Description For Staff Software Engineer, Colossus Site Reliability Engineering

Google's Site Reliability Engineering (SRE) organization is seeking a Staff Software Engineer for its Colossus SRE team. This role combines software and systems engineering to build and maintain large-scale, distributed, fault-tolerant systems. As an SRE, you'll ensure that Google Cloud's services maintain reliability and appropriate uptime while continuously improving performance and capacity.

The position requires deep expertise in distributed systems, with a focus on optimizing existing systems, building infrastructure, and implementing automation. You'll be working on unique scale challenges specific to Google Cloud, applying your knowledge of coding, algorithms, and complex system design. The role involves both pre-deployment activities like system design consulting and post-deployment responsibilities including monitoring system health and implementing improvements.

This role is part of the Technical Infrastructure team, which is fundamental to Google's product portfolio: it develops and maintains data centers and builds next-generation Google platforms. The team takes pride in its engineering excellence and innovative problem-solving approach.

This is an excellent opportunity for experienced engineers who are passionate about large-scale systems, have strong leadership capabilities, and want to work on technology that impacts billions of users. The role offers the chance to work in a culture that values intellectual curiosity and collaboration, with opportunities to tackle complex technical challenges while growing professionally.

The position comes with Google's comprehensive benefits package and the opportunity to work with some of the industry's brightest minds in a company known for its technical innovation and global impact. You'll be part of a team that promotes self-direction while providing the support and mentorship needed for continuous learning and growth.

Last updated a month ago

Responsibilities For Staff Software Engineer, Colossus Site Reliability Engineering

  • Engage in and improve the whole lifecycle of services, from inception and design through to deployment, operation and refinement
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Practice sustainable incident response and postmortems

Requirements For Staff Software Engineer, Colossus Site Reliability Engineering

Java
Linux
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 8 years of experience with software development in one or more programming languages
  • 3 years of experience in leading projects
  • 3 years of experience in designing, analyzing, and troubleshooting distributed systems
  • Experience working in computing, distributed systems, storage, or networking
  • Experience in Java or C/C++
  • Excellent investigative, problem-solving and communication skills

Benefits For Staff Software Engineer, Colossus Site Reliability Engineering

Medical Insurance
Dental Insurance
Vision Insurance
401k
Parental Leave
  • Comprehensive health benefits
  • Retirement plans
  • Parental leave
  • Professional development opportunities
  • Competitive compensation package