Production Support Engineering LMTS

Global leader in CRM and enterprise cloud computing solutions
$184,000 - $276,100
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS

Description For Production Support Engineering LMTS

Join MuleSoft, a Salesforce Company, as a Site Reliability Engineer (SRE) focused on maintaining and improving the reliability of cloud infrastructure. This role combines software engineering with operations expertise to ensure high availability and performance of distributed systems. You'll be responsible for full stack observability, event response, and incident management, particularly in GovCloud environments. The position requires deep technical expertise in cloud platforms, monitoring tools, and automation, with a special focus on maintaining FedRAMP compliance. You'll work with cutting-edge technologies including AWS, Kubernetes, and various monitoring and automation tools. This is a unique opportunity to impact the reliability of enterprise-scale cloud services while working with a team dedicated to maintaining industry-leading uptime standards. The role requires U.S. citizenship and ability to obtain government clearances, making it ideal for those interested in working with federal systems. You'll be part of a company known for its innovative culture and strong values, while helping to shape the future of cloud reliability engineering.

Last updated 2 days ago

Responsibilities For Production Support Engineering LMTS

  • Maintain and improve service reliability, availability, and performance across distributed systems
  • Design, build, and maintain comprehensive monitoring, logging, and alerting systems
  • Respond to production incidents and perform root cause analysis
  • Automate repetitive tasks using scripts and infrastructure-as-code tools
  • Monitor usage trends and forecast growth
  • Maintain and improve continuous integration and continuous delivery pipelines
  • Collaborate with security teams to ensure systems adhere to compliance requirements
  • Work closely with development teams to design resilient systems
  • Create and maintain documentation for runbooks, systems, and processes

Requirements For Production Support Engineering LMTS

  • 8+ years of experience in an SRE role or related field
  • Experience in Public Cloud environments, specifically with AWS
  • Experience with New Relic, collectd, Splunk, Sumo Logic, Grafana, Terraform, Jenkins, Kubernetes, Spinnaker
  • Excellent knowledge of Internet technologies and protocols
  • Strong experience with API fundamentals
  • Ability to root cause issues in high-traffic, large-scale distributed systems
  • Experience with development in Python, Go, Bash
  • Experience with FedRAMP environments
  • A related technical degree is required
  • Must be a U.S. citizen operating on U.S. soil

Benefits For Production Support Engineering LMTS

  • Competitive salary range
  • Medical, dental, and vision coverage
  • 401k benefits
  • Professional development opportunities
