Senior Site Reliability Engineer

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently. We provide software for Property and Casualty Insurance companies to handle core operations, data management, digital portals, and predictive analytics.
Site Reliability
Senior Software Engineer
Hybrid
1,000 - 5,000 Employees
10+ years of experience
Finance · Enterprise SaaS
This job posting may no longer be active.

Description For Senior Site Reliability Engineer

At Guidewire, we create software for Property and Casualty (P&C) insurance companies, helping them manage policies, claims, and billing. Our products run on the Guidewire Cloud Platform, serving hundreds of insurance providers globally. As a Senior Site Reliability Engineer, you'll join a team dedicated to automating and improving the reliability of Guidewire's systems. You'll help ensure the reliability of our flagship cloud platform and InsuranceSuite products by building tools for efficient operations and optimal availability of multi-tenant SaaS and customer-focused systems.

This role requires strong collaboration, ownership, and problem-solving skills, especially with technologies like AWS, Kubernetes, and Aurora. You'll work closely with core product developers to address functional and non-functional requirements such as availability, performance, observability, and maintainability. The ideal candidate has a passion for automation, learns quickly and independently, and has experience supporting SaaS platforms in production in cloud-native environments. This position offers the opportunity to make a significant impact on critical systems that handle millions of transactions daily, while working with cutting-edge technologies in a collaborative and innovative environment.

Last updated 7 months ago

Responsibilities For Senior Site Reliability Engineer

  • Collaborate with development teams to enhance reliability and efficiency of microservices applications
  • Participate in design reviews and production readiness checks
  • Analyze data from observability and monitoring tools to improve operational metrics
  • Create system documentation and training materials
  • Oversee and automate the team's growing presence in AWS
  • Build and maintain observability tooling, metrics, and dashboards
  • Improve the incident management lifecycle
  • Contribute code to the product when necessary

Requirements For Senior Site Reliability Engineer

Java · Python · Go · Linux · Kubernetes · PostgreSQL
  • Bachelor's Degree in Computer Science or related field
  • 10+ years of experience
  • Software engineering and task automation skills with Bash, Python, and/or Go
  • Experience in developing and maintaining Java-based web applications
  • Deep background with Linux systems and engineering
  • Extensive experience engineering and automating on Amazon Web Services (AWS)
  • Prior experience with IaC tools like Terraform/Terragrunt/Terraspace
  • Background supporting production at scale in a microservice-based environment
  • Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)
  • Strong understanding of single sign-on (SSO), SAML, and OAuth
  • Experience working with relational databases such as Aurora Postgres and/or Oracle RDS
  • Advanced exposure to application development, web UI, JSON, and application architecture
  • Experience with observability tools like Datadog, CloudWatch, and PagerDuty
  • Familiarity with event store/stream-processing technologies like Kafka or AWS SQS
  • Understanding of Open Application Model systems such as KubeVela or Crossplane
  • Ability to read, write, and speak English
  • Willingness to be on-call for weekend production emergencies

Benefits For Senior Site Reliability Engineer

  • Travel opportunities for training and team meetings
