Senior Site Reliability Engineer

Guidewire

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently. We provide software for Property and Casualty Insurance companies to handle core operations, data management, digital portals, and predictive analytics.

Bengaluru, Karnataka, India

Site Reliability

Senior Software Engineer

Hybrid

1,000 - 5,000 Employees

10+ years of experience

Finance · Enterprise SaaS


Description For Senior Site Reliability Engineer

At Guidewire, we create software for Property and Casualty (P&C) Insurance companies, helping them manage policies, claims, and billing. Our products run on the Guidewire Cloud Platform, serving hundreds of insurance providers globally. As a Senior Site Reliability Engineer, you'll join a team dedicated to automating and improving the reliability of Guidewire's systems. You'll work on ensuring the reliability of our flagship cloud platform and InsuranceSuite products, building tools for efficient operations and optimal availability of SaaS multi-tenant and customer-focused systems. This role requires strong collaboration, ownership, and problem-solving skills, especially with systems like AWS, Kubernetes, and Aurora. You'll work closely with core product developers to address functional and non-functional requirements such as availability, performance, observability, and maintainability. The ideal candidate should have a passion for automation, rapid self-learning, and experience with production support of SaaS platforms in cloud-native environments. This position offers the opportunity to make a significant impact on critical systems serving millions of transactions daily, working with cutting-edge technologies in a collaborative and innovative environment.

Last updated 10 months ago

Responsibilities For Senior Site Reliability Engineer

Collaborate with development teams to enhance reliability and efficiency of microservices applications
Participate in design reviews and production readiness checks
Analyze data from observability and monitoring tools to improve operational metrics
Create system documentation and training materials
Oversee and automate the team's growing presence in AWS
Build and maintain observability tooling, metrics, and dashboards
Improve the incident management lifecycle
Contribute code to the product when necessary

Requirements For Senior Site Reliability Engineer

Java · Python · Linux · Kubernetes · PostgreSQL

Bachelor's Degree in Computer Science or related field
10+ Years of experience
Software engineering and task automation skills with Bash, Python, and/or Go
Experience in developing and maintaining Java-based web applications
Deep background with Linux systems and engineering
Extensive experience engineering and automating on Amazon Web Services (AWS)
Prior experience with IaC tools like Terraform/Terragrunt/Terraspace
Background supporting production systems at scale in a microservice-based environment
Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)
Strong understanding of Single Sign-On (SSO), SAML, and OAuth
Experience working with relational databases such as Aurora PostgreSQL and/or Amazon RDS for Oracle
Advanced exposure to application development, web UI, JSON, and application architecture
Experience with observability tools like Datadog, CloudWatch, and PagerDuty
Familiarity with event-streaming and message-queuing technologies like Kafka or Amazon SQS
Understanding of Open Application Model systems such as KubeVela or Crossplane
Ability to read, write, and speak English
Willingness to be on-call for weekend production emergencies

Benefits For Senior Site Reliability Engineer

Travel opportunities for training and team meetings