Principal Service Reliability Engineer

A world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. The company has operated for 40+ years, partnering with industry leaders across sectors.
Site Reliability
Principal Software Engineer
In-Person
5,000+ Employees
10+ years of experience
Enterprise SaaS · Cloud

Description For Principal Service Reliability Engineer

Oracle, the world leader in Enterprise Cloud, is seeking a Principal Service Reliability Engineer to join its ERP Cloud Operations team. This role combines deep technical expertise with service ownership and operational excellence. You'll be responsible for ensuring critical services are designed and delivered with a focus on monitoring, telemetry, security, resiliency, scale, and performance.

As a Principal SRE, you'll partner with Service Development teams to engineer and enhance Oracle's SaaS/ERP service portfolio. The role requires expertise in service architecture, operational engineering, and incident response. You'll be instrumental in defining software engineering patterns and practices that increase the reliability and resilience of services.

The ideal candidate brings 10+ years of experience, with deep knowledge of cloud platforms, automation, and modern DevOps practices. You'll work with technologies like Kubernetes, Docker, and various monitoring and alerting tools. Strong communication skills are essential, as you'll interact with technical and non-technical stakeholders, including executive leadership.

Oracle offers a collaborative environment where you'll be part of a dynamic revolution in cloud-based applications. The company provides competitive benefits, including medical, dental, vision, and retirement options, along with opportunities for work-life balance and community involvement through volunteer programs.

This role is perfect for someone who thrives on solving complex technical challenges, is passionate about service reliability, and wants to make a significant impact at a global technology leader. You'll be joining a team that values innovation, customer success, and technical excellence in building and maintaining large-scale distributed systems.

Last updated 8 days ago

Responsibilities For Principal Service Reliability Engineer

  • Service Ownership - Full-stack ownership of services together with Service Development teams
  • Service Design - Partner with the SRE Architect to define and implement service architecture improvements
  • Operations Engineering - Understand and communicate scale, capacity, security, and performance attributes
  • Technical Expert - Handle complex issues and serve as SME during major incidents
  • Incident Response - Author technical content for the incident response process
  • Automation - Implement automation and orchestration principles
  • Prevention - Work on solutions to prevent recurring incidents

Requirements For Principal Service Reliability Engineer

Python
Java
Go
Ruby
JavaScript
TypeScript
React
Linux
Kubernetes
  • BS in Computer Science or a related field and 7 years of relevant experience
  • Minimum of 5 years of software development experience
  • Experience deploying and running large-scale online systems built on cloud platforms
  • 3+ years of experience in systems and network administration, DevOps and/or Site Reliability Engineering
  • Experience with monitoring and alerting using technologies like Prometheus, Sensu, Nagios, and Kafka
  • Experience implementing Docker, Kubernetes, and serverless technologies
  • Experience with configuration management systems
  • Knowledge of testing methodologies and automation tools

Benefits For Principal Service Reliability Engineer

Medical Insurance
Dental Insurance
Vision Insurance
401k
Parental Leave
  • Competitive benefits based on parity and consistency
  • Flexible medical, life insurance, and retirement options
  • Volunteer programs
  • Work-life balance
