Senior Site Reliability Engineer

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently. We provide software for Property and Casualty Insurance companies to handle core operations, data management, digital portals, and predictive analytics.
Site Reliability
Senior Software Engineer
Hybrid
1,000 - 5,000 Employees
10+ years of experience
Finance · Enterprise SaaS

Description For Senior Site Reliability Engineer

At Guidewire, we create software that helps Property and Casualty (P&C) insurance companies manage policies, claims, and billing. Our products run on the Guidewire Cloud Platform, which serves hundreds of insurance providers globally. As a Senior Site Reliability Engineer, you'll join a team dedicated to automating and improving the reliability of Guidewire's systems. You'll help ensure the reliability of our flagship cloud platform and InsuranceSuite products, building tools for efficient operations and optimal availability of multi-tenant SaaS and customer-focused systems. The role requires strong collaboration, ownership, and problem-solving skills, particularly when working with systems such as AWS, Kubernetes, and Aurora. You'll work closely with core product developers to address functional and non-functional requirements such as availability, performance, observability, and maintainability. The ideal candidate has a passion for automation, learns quickly and independently, and has experience supporting SaaS platforms in production in cloud-native environments. This position offers the opportunity to make a significant impact on critical systems that process millions of transactions daily, working with cutting-edge technologies in a collaborative and innovative environment.

Last updated 6 months ago

Responsibilities For Senior Site Reliability Engineer

  • Collaborate with development teams to enhance reliability and efficiency of microservices applications
  • Participate in design reviews and production readiness checks
  • Analyze data from observability and monitoring tools to improve operational metrics
  • Create system documentation and training materials
  • Oversee and automate the team's growing presence in AWS
  • Build and maintain observability tooling, metrics, and dashboarding
  • Improve incident management lifecycle
  • Contribute code to the product when necessary

Requirements For Senior Site Reliability Engineer

Java · Python · Go · Linux · Kubernetes · PostgreSQL
  • Bachelor's Degree in Computer Science or related field
  • 10+ years of experience
  • Software engineering and task automation skills with Bash, Python, and/or Go
  • Experience in developing and maintaining Java-based web applications
  • Deep background with Linux systems and engineering
  • Highly experienced with engineering and automating on Amazon Web Services (AWS)
  • Prior experience with IaC tools like Terraform/Terragrunt/Terraspace
  • Background supporting production systems at scale in a microservice-based environment
  • Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)
  • Strong understanding of single sign-on (SSO), SAML, and OAuth
  • Experience working with relational databases such as Aurora PostgreSQL and/or Oracle on RDS
  • Advanced exposure to application development, web UI, JSON, and application architecture
  • Experience with observability tools like Datadog, CloudWatch, and PagerDuty
  • Familiarity with event store/stream-processing technologies like Kafka or AWS SQS
  • Understanding of Open Application Model systems such as KubeVela or Crossplane
  • Ability to read, write, and speak English
  • Willingness to be on-call for weekend production emergencies

Benefits For Senior Site Reliability Engineer

  • Travel opportunities for training and team meetings
