Senior Site Reliability Engineer

Largest business automation cloud platform provider offering CRM and enterprise cloud solutions
$172,000 - $236,500
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Enterprise SaaS · Cloud

Description For Senior Site Reliability Engineer

Salesforce, the world's largest business automation cloud platform, is seeking a Senior Site Reliability Engineer to join their infrastructure team. This role is crucial in managing and evolving their multi-substrate Kubernetes and microservices platform that powers their Core CRM and growing suite of applications. The position offers an opportunity to work with cutting-edge cloud-native technologies and AI-driven operational practices, focusing on building highly reliable, self-healing, and scalable service mesh systems.

The role involves managing high availability for microservices across 1000+ clusters, implementing monitoring solutions, and driving automation efforts. You'll work with modern technologies including Kubernetes, Docker, service mesh, and various cloud-native tools. The position requires strong technical expertise in systems engineering, cloud platforms, and programming languages like Python and Go.

This is an excellent opportunity for an experienced SRE professional looking to impact large-scale systems. You'll be part of a highly innovative team, working on critical infrastructure that supports thousands of internal developers and tens of thousands of customers. The role offers exposure to complex distributed systems, cutting-edge cloud technologies, and the chance to solve challenging technical problems at scale.

Working at Salesforce means joining a company at the forefront of enterprise cloud computing, with a strong focus on innovation and technical excellence. The role provides opportunities for professional growth, working with talented engineers, and contributing to systems that power some of the world's largest businesses. If you're passionate about reliability, automation, and building robust cloud infrastructure, this position offers the perfect blend of challenge and opportunity.

Last updated 20 hours ago

Responsibilities For Senior Site Reliability Engineer

  • Ensure high availability for microservices supporting service mesh and ingress gateway on 1000+ clusters
  • Contribute code to drive service availability improvement
  • Implement monitoring and metrics with Prometheus, Grafana and other frameworks
  • Drive automation efforts in Python/Golang/Puppet/Jenkins
  • Improve CI/CD pipelines built on Terraform, Spinnaker and Argo
  • Implement AIOps automation, monitoring and self-healing mechanisms
  • Collaborate with various Infrastructure teams across Salesforce
  • Evaluate new technologies to solve problems

Requirements For Senior Site Reliability Engineer

Kubernetes · Go · Python · Linux · Redis

  • 3+ years of experience in SRE/DevOps/Systems Engineering roles
  • Experience operating large-scale cluster management systems
  • Strong working experience with Kubernetes, Docker, Container Orchestration, Service Mesh, Ingress Gateway
  • Good knowledge of network technologies (TCP/IP, DNS, TLS termination, HTTP proxies, load balancers)
  • Excellent troubleshooting skills
  • Strong experience with observability tools such as Prometheus, Grafana, Splunk, and Elasticsearch
  • Strong working experience with Linux Systems Administration
  • Good experience with scripting/programming languages: Python, Go
  • Experience with AWS, Terraform, Spinnaker, ArgoCD
  • Excellent problem-solving, analytical and communication skills
