Lead Site Reliability Engineer

Salesforce is the Customer Company, providing AI + Data + CRM solutions to help companies connect with customers in innovative ways.
$200,800 - $276,100
Site Reliability
Staff Software Engineer
Hybrid
5,000+ Employees
8+ years of experience
Enterprise SaaS

Description For Lead Site Reliability Engineer

Salesforce is seeking a Lead Site Reliability Engineer to join its Marketing Automation Platform & Data Operations team within the Marketing Technology organization. The role is central to the reliability and operational efficiency of Salesforce's critical Marketing Technology ecosystem. It calls for an experienced engineer who can bridge software engineering and system administration, with a particular focus on monitoring, visualization, and alerting tools.

The ideal candidate will take ownership of service reliability, lead incident investigations, and drive automation initiatives to enhance system stability. They will work with monitoring and visualization platforms, including Datadog, Splunk, Grafana, and New Relic, while managing reliability within the Salesforce ecosystem, including Slack, Data Cloud, Tableau, and Heroku.

Key responsibilities include managing cloud infrastructure, implementing Infrastructure as Code, maintaining CI/CD pipelines, and leading incident response efforts. The role requires expertise in programming and scripting languages such as Python, Go, and Java, along with strong experience in cloud platforms (AWS, Azure, GCP) and tools like Terraform and Kubernetes.

The position offers compensation ranging from $200,800 to $276,100 for California-based roles, along with benefits including medical, dental, and vision coverage, a 401(k), and an employee stock purchase program. This is a hybrid role based in San Francisco, combining office and remote work.

The successful candidate will have 8+ years of relevant experience, demonstrate strong leadership and communication skills, and have a proven track record in maintaining high-reliability systems at scale. They will join a team committed to ensuring trust and security while driving innovation in Salesforce's marketing technology infrastructure.

Last updated a month ago

Responsibilities For Lead Site Reliability Engineer

  • Ensure reliability, performance, and scalability of critical software systems
  • Lead incident investigations and drive automation initiatives
  • Define and manage SLOs and SLAs
  • Conduct detailed root cause analyses
  • Act as primary point of contact for escalations
  • Mentor junior engineers
  • Develop and execute disaster recovery plans
  • Collaborate across teams including developers, platform engineers, architects, QA, and operations

Requirements For Lead Site Reliability Engineer

  • Skills: Go, Java, Python, Kubernetes
  • 8+ years of relevant industry experience in monitoring, alerting, and visualization systems
  • Advanced expertise with Datadog, Splunk, Grafana, Tableau, New Relic, and PagerDuty
  • Deep knowledge of cloud infrastructures (AWS, Azure, GCP)
  • Experience managing reliability within the Salesforce ecosystem
  • Strong relationship-building skills across technical and business teams
  • Excellent verbal, written, and interpersonal skills

Benefits For Lead Site Reliability Engineer

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • 401(k)
  • Parental leave
  • Time off programs
  • Mental health support
  • Life and disability insurance
  • Employee stock purchasing program
