Site Reliability Engineer (SRE) – Infrastructure and Observability

Worldpay

A leading global payments company that provides payment processing and technology solutions.

Cincinnati, OH, USA

Site Reliability

Mid-Level Software Engineer

In-Person

5,000+ Employees

3+ years of experience

Finance

Job Description

Join Worldpay, a leading global payments company, as a Site Reliability Engineer (SRE) on its Technology Services Operations team. This role combines software engineering with systems engineering to enhance platform reliability and performance. You'll be instrumental in preventing incidents, automating operations, and improving observability across complex environments that support innovative fintech products.

As an SRE, you'll work with cutting-edge technologies and tools like Splunk, Prometheus, and Kubernetes to ensure system reliability. You'll collaborate with cross-functional teams to implement scalable solutions and drive continuous improvement initiatives. The role offers a unique opportunity to shape a high-performing SRE function from the ground up, with strong executive support and a clear roadmap for success.

The position requires expertise in incident management, observability, and automation, with hands-on experience in tools like Splunk, OpenTelemetry, and various monitoring solutions. You'll be part of a dynamic team that keeps Worldpay's technology infrastructure running smoothly, enabling millions of payment transactions worldwide.

This is an excellent opportunity for a mid-level engineer with 3+ years of experience who wants to work at the intersection of software development and operations. You'll be part of a company that values innovation, collaboration, and technical excellence, with the chance to make a significant impact on global payment systems. The role offers exposure to financial technology, high-availability environments, and modern cloud-native architectures.

Responsibilities For Site Reliability Engineer (SRE) – Infrastructure and Observability

Analyze incident data from platforms like ServiceNow, IBM Netcool, and PagerDuty
Collaborate with teams to improve platform availability, stability, and performance
Identify and close observability gaps in logging, monitoring, and alerting
Integrate pre- and post-change validation testing into CI/CD pipelines
Develop automated runbooks for common incident types
Participate in Change Advisory Boards and root cause analysis processes
Contribute to monthly retrospectives and quarterly SRE health reports

Requirements For Site Reliability Engineer (SRE) – Infrastructure and Observability

Python

Java

JavaScript

Kubernetes

3+ years of experience in Site Reliability Engineering, DevOps, or a related technical role
Strong understanding of incident management and service reliability principles
Experience in IT Operations, with a focus on observability and log management
Experience with Splunk Enterprise, Splunk Cloud, Prometheus, Grafana, and Zabbix
Proficiency in scripting languages (Python, Bash) and infrastructure-as-code tools
Experience developing Splunk queries and dashboards using SPL
Familiarity with CI/CD pipelines and automated testing frameworks
Strong communication skills and ability to work effectively across teams
