Site Reliability Engineer - GovCloud 24x7

Salesforce is a leading cloud-based customer relationship management (CRM) platform.
$114,200 - $157,100
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud · Cybersecurity

Description For Site Reliability Engineer - GovCloud 24x7

Salesforce is seeking a Site Reliability Engineer for its GovCloud 24x7 team. This role is part of the GovCloud Incident Response (GIR) team, which maintains the current infrastructure through daily alert response, smart hands support, and incident management. Candidates must be U.S. citizens working on U.S. soil and must be able to meet customer and government screening standards.

Key responsibilities include:

  • Maintaining customer-facing services at top performance
  • Managing incidents and participating in technical reviews
  • Conducting problem management and participating in RCAs
  • Ensuring compliance with company policies and directives
  • Collaborating with other technical staff to solve issues
  • Staying updated on industry innovations and technologies

The role requires working on a 24/7 team with rotating day and night shifts and participating in an on-call rotation. Candidates should have expertise in TCP/IP technologies, Unix variants (especially Linux and Solaris), monitoring security systems, and incident management. Experience with AWS/C2S infrastructure, scripting languages, and ITIL service operations is essential.

Preferred qualifications include experience with Chef/Puppet, Jenkins/Bamboo/Spinnaker, Java applications, and Kubernetes, as well as Linux+, Red Hat, and AWS certifications. Familiarity with Agile and DevOps processes, along with experience in resilience engineering and post-incident investigations, is highly valued.

This challenging role offers the opportunity to work with cutting-edge technologies in a dynamic, high-stakes environment, supporting critical government cloud infrastructure. Join Salesforce's GovCloud team to make a significant impact on the reliability and performance of essential services.

Last updated a month ago

Responsibilities For Site Reliability Engineer - GovCloud 24x7

  • Maintain customer-facing services at top performance
  • Manage incidents and participate in technical reviews
  • Conduct problem management and participate in RCAs
  • Ensure compliance with company policies
  • Collaborate with technical staff to solve issues
  • Stay updated on industry innovations
  • Work on a 24/7 team with rotating shifts
  • Participate in on-call rotation

Requirements For Site Reliability Engineer - GovCloud 24x7

Linux · Python · Go
  • U.S. Citizenship
  • Ability to meet government screening standards
  • Systems engineering experience with enterprise-scale internet services
  • Expertise in TCP/IP technologies
  • Expertise in Unix variants (Linux/Solaris/BSD)
  • Strong understanding of monitoring security systems
  • Strong communication skills
  • Experience in Incident Management and ITIL service operations
  • Experience with AWS/C2S infrastructure
  • Scripting skills in Python, Go, or other languages
