
Principal Site Reliability Developer (IC4)

A world leader in cloud solutions with 40+ years of experience, using tomorrow's technology to tackle today's challenges.
$97,500 - $199,500
Site Reliability
Principal Software Engineer
In-Person
5,000+ Employees
7+ years of experience
Enterprise SaaS · Cloud

Description For Principal Site Reliability Developer (IC4)

We are looking for a Principal Site Reliability Engineer to join our OCI team. This role is part of a globally distributed team responsible for detecting, triaging, and mitigating OCI service-impacting events as quickly as possible. You will join one of its regional teams and will be responsible for minimizing the downtime of OCI services. You will achieve this by delivering excellent major incident management and by operating highly scalable, performant, and secure systems that help prevent incidents from occurring.

Oracle's Cloud is state-of-the-art and constantly evolving. When issues arise, your team will respond within minutes to ensure customer impact is minimized. This role will expose you to the inner workings of OCI's systems and organization. You will interact with and influence leaders across Oracle and drive broad, cross-organization programs aimed at iteratively improving OCI-wide service availability. We are an agile team with significant impact.

As a Principal SRE, you will be responsible for the design and delivery of mission-critical infrastructure, focusing on security, resiliency, scale, and performance. You will work closely with development teams to improve service architecture and implement best practices for cloud operations. The role requires deep technical expertise in cloud platforms, automation, and modern DevOps practices, making you a key contributor to Oracle's cloud infrastructure reliability and performance.

Last updated 16 days ago

Responsibilities For Principal Site Reliability Developer (IC4)

  • Design and deliver a mission-critical stack with a focus on security, resiliency, scale, and performance
  • Partner with development teams in defining and implementing service architecture improvements
  • Guide development teams to engineer and add premier capabilities to Oracle Cloud
  • Act as the ultimate escalation point for complex or critical issues
  • Troubleshoot issues and define mitigations using a deep understanding of service topology
  • Implement automation and orchestration principles

Requirements For Principal Site Reliability Developer (IC4)

  • Minimum 7 years of hands-on Platform Engineering, DevOps or SRE experience
  • Experience with public cloud (OCI, AWS, GCP, Azure)
  • Knowledge of Infrastructure as Code (IaC), Configuration as Code (CaC), and GitOps
  • Experience with Python or Java
  • Experience with Kubernetes and cloud infrastructure
  • Experience with monitoring tools like Prometheus, Grafana, EFK/ELK
  • Experience with CI/CD pipelines
  • Strong Linux/Unix environment experience
  • BS or MS in Computer Science, Computer Engineering, or equivalent
  • Must be eligible to obtain and maintain a US government security clearance

Benefits For Principal Site Reliability Developer (IC4)

  • Mental health assistance
  • Medical, dental, and vision insurance
  • Short-term and long-term disability
  • Life insurance and AD&D
  • Health care and dependent care Flexible Spending Accounts
  • 401(k) Savings and Investment Plan with company match
  • Flexible Vacation
  • 11 paid holidays
  • Paid sick leave
  • Paid parental leave
  • Adoption assistance
  • Employee Stock Purchase Plan
