Are you passionate about solving complex distributed systems challenges at scale? Join Oracle as a Site Reliability Engineer and help shape the reliability, scalability, and performance of Oracle Cloud Infrastructure (OCI). As part of the Site Reliability Engineering (SRE) team, you'll contribute to designing, automating, and evolving mission-critical systems that directly impact thousands of customers worldwide.
The role requires advanced Linux systems administration, strong Python coding skills, and experience with CI/CD pipelines. You'll be responsible for ensuring end-to-end reliability across various services, building automation tools, and maintaining system health metrics. Key responsibilities include designing software for enhanced availability, managing SLOs/SLAs, and participating in on-call rotations.
Oracle offers a collaborative environment where you'll work with cutting-edge cloud technology and contribute to large-scale distributed systems. The position combines deep technical expertise with modern software engineering practices, making it a strong fit for engineers who care about system reliability and automation.
As an SRE at Oracle, you'll have the opportunity to influence architectural decisions, lead post-incident reviews, and build tools that improve operational efficiency. The role offers competitive benefits, including medical and life insurance and retirement options, and supports work-life balance. Oracle is committed to diversity and inclusion and provides equal opportunities for all qualified candidates.
This position requires 3 to 5 or more years of experience and strong English language skills. You'll be based in Zapopan, Mexico, working with global teams to maintain and improve Oracle's cloud infrastructure. The role offers significant growth potential alongside hands-on work with industry-leading cloud technology.