Site Reliability Engineer (SRE) – Infrastructure and Observability

A leading global payments company that provides payment processing and technology solutions.
Cincinnati, OH, USA
Site Reliability
Mid-Level Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Finance

Job Description

Join Worldpay, a leading global payments company, as a Site Reliability Engineer (SRE) on its Technology Services Operations team. This role combines software engineering with systems engineering to enhance platform reliability and performance. You'll be instrumental in preventing incidents, automating operations, and improving observability across complex environments that support innovative fintech products.

As an SRE, you'll work with cutting-edge technologies and tools like Splunk, Prometheus, and Kubernetes to ensure system reliability. You'll collaborate with cross-functional teams to implement scalable solutions and drive continuous improvement initiatives. The role offers a unique opportunity to shape a high-performing SRE function from the ground up, with strong executive support and a clear roadmap for success.

The position requires expertise in incident management, observability, and automation, along with hands-on experience with tools such as Splunk, OpenTelemetry, and various monitoring solutions. You'll be part of a dynamic team that keeps Worldpay's technology infrastructure running smoothly, enabling millions of payment transactions worldwide.

This is an excellent opportunity for a mid-level engineer with 3+ years of experience who wants to work at the intersection of software development and operations. You'll be part of a company that values innovation, collaboration, and technical excellence, with the chance to make a significant impact on global payment systems. The role offers exposure to financial technology, high-availability environments, and modern cloud-native architectures.

Last updated 2 days ago

Responsibilities For Site Reliability Engineer (SRE) – Infrastructure and Observability

  • Analyze incident data from platforms like ServiceNow, IBM Netcool, and PagerDuty
  • Collaborate with teams to improve platform availability, stability, and performance
  • Identify and close observability gaps in logging, monitoring, and alerting
  • Integrate pre- and post-change validation testing into CI/CD pipelines
  • Develop automated runbooks for common incident types
  • Participate in Change Advisory Boards and root cause analysis processes
  • Contribute to monthly retrospectives and quarterly SRE health reports

Requirements For Site Reliability Engineer (SRE) – Infrastructure and Observability

Python
Java
JavaScript
Kubernetes
  • 3+ years of experience in Site Reliability Engineering, DevOps, or a related technical role
  • Strong understanding of incident management and service reliability principles
  • Experience in IT Operations, with a focus on observability and log management
  • Experience with Splunk Enterprise, Splunk Cloud, Prometheus, Grafana, and Zabbix
  • Proficiency in scripting languages (Python, Bash) and infrastructure-as-code tools
  • Experience developing Splunk queries and dashboards using SPL
  • Familiarity with CI/CD pipelines and automated testing frameworks
  • Strong communication skills and ability to work effectively across teams
