Salesforce is seeking a Lead Site Reliability Engineer to join its Marketing Automation Platform & Data Operations team within the Marketing Technology organization. This role is crucial to ensuring the reliability and operational efficiency of Salesforce's critical Marketing Technology ecosystem. The position requires an experienced engineer who will bridge software engineering and system administration, with a particular focus on monitoring, visualization, and alerting tools.
The ideal candidate will take ownership of service reliability, lead incident investigations, and drive automation initiatives to enhance system stability. They will work with monitoring and visualization platforms including Datadog, Splunk, Grafana, and New Relic, while managing reliability across the Salesforce ecosystem, including Slack, Data Cloud, Tableau, and Heroku.
Key responsibilities include managing cloud infrastructure, implementing Infrastructure as Code, maintaining CI/CD pipelines, and leading incident response efforts. The role requires expertise in programming and scripting languages such as Python, Go, and Java, along with strong experience in cloud platforms (AWS, Azure, GCP) and tools such as Terraform and Kubernetes.
The position offers competitive compensation ranging from $200,800 to $276,100 for California-based roles, along with comprehensive benefits including medical, dental, vision coverage, 401(k), and stock purchase options. This is a hybrid role based in San Francisco, offering the flexibility of both office and remote work.
The successful candidate will have 8+ years of relevant experience, demonstrate strong leadership and communication skills, and have a proven track record in maintaining high-reliability systems at scale. They will join a team committed to ensuring trust and security while driving innovation in Salesforce's marketing technology infrastructure.