Lead Site Reliability Engineer

A leading global financial services company providing banking, investments, and other financial solutions.
Charlotte, NC, USA
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Finance

Description For Lead Site Reliability Engineer

Wells Fargo is seeking a Lead Site Reliability Engineer to join its Wealth and Investment Management Technology team. This role suits someone who thinks systematically about reliability and can translate business requirements into technical implementations. The position sits at the intersection of software engineering and operations, where you'll apply engineering principles to solve complex infrastructure challenges.

As an SRE lead, you'll be responsible for designing and implementing scalable systems, creating observability solutions, and developing automation to enhance platform reliability. The role involves working closely with developers and business stakeholders to maintain high system availability and reliability. You'll lead incident response efforts, conduct post-mortem analyses, and implement preventive measures so that incidents do not recur.

The ideal candidate brings 5+ years of experience in both Technology Infrastructure Engineering and Site Reliability Engineering. You should have strong expertise in REST APIs, monitoring tools like Splunk and AppDynamics, and database systems including MongoDB and Oracle. Your technical skills should be complemented by excellent communication abilities and strong problem-solving capabilities.

This position offers the opportunity to work with a leading financial institution, making a significant impact on system reliability and performance. You'll be part of a diverse team environment, participating in on-call rotations to ensure 24/7 system availability. The role provides competitive benefits including medical, dental, and vision insurance, along with opportunities for professional growth in a stable, respected organization.

Responsibilities For Lead Site Reliability Engineer

  • Work alongside developers and business stakeholders to automate acceptance criteria
  • Maintain high reliability and availability for software applications
  • Automate routine tasks to reduce human error
  • Define SLIs and SLOs in collaboration with product owners
  • Lead incident response efforts and post-mortem analysis
  • Write incident root cause analysis
  • Document procedures, best practices and troubleshooting FAQs
  • Debug systems and fix production-related issues
  • Handle complex operational tasks
  • Provide global support including troubleshooting
  • Participate in on-call rotations for 24/7 system support

Requirements For Lead Site Reliability Engineer

MongoDB
Linux
Kubernetes
  • 5+ years of Technology Infrastructure Engineering and Solutions experience
  • 5+ years of Site Reliability Engineering experience
  • Strong understanding of REST APIs
  • Experience with troubleshooting tools (Splunk, AppDynamics, Elastic APM)
  • Experience with API Management tools like Apigee
  • Working knowledge of databases (MongoDB, Oracle)
  • Strong foundation in reliability engineering principles
  • Experience defining and implementing SLOs/SLIs
  • Experience with observability solutions
  • Strong incident response skills
  • Excellent problem-solving abilities
  • Strong communication skills
  • Ability to work weekends
  • Ability to work both independently and collaboratively

Benefits For Lead Site Reliability Engineer

  • Equal opportunity employer
  • Medical, dental, and vision insurance
  • Accommodation for applicants with disabilities
