Site Reliability Engineer

A world leader in cloud solutions that uses tomorrow's technology to tackle today's challenges, partnering with industry leaders in almost every sector for over 40 years.
San José Province, San José, Costa Rica
Site Reliability
Mid-Level Software Engineer
Hybrid
5,000+ Employees
3+ years of experience
Enterprise SaaS · Cloud

Description For Site Reliability Engineer

Oracle is seeking a Site Reliability Engineer to join their NetSuite application team, focusing on ensuring uptime, performance, and reliability across their production environments as they expand into all regions of Oracle Cloud Infrastructure. This hybrid role combines technical expertise with operational excellence, requiring 3-5+ years of experience in large-scale production environments.

The position involves working with cutting-edge cloud infrastructure, where you'll be responsible for solving complex problems and building automation to prevent issues. You'll be part of a world-class Site Reliability team, bringing expertise in monitoring, backups, infrastructure, and systems architecture. The role requires collaboration with multiple engineering teams in NetSuite Cloud Operations, including the system engineering, network, infrastructure engineering, security, and maintenance teams.

As an SRE at Oracle, you'll work with modern technologies including Linux, Python, Kubernetes, Redis, Cassandra, and Kafka. You'll use monitoring tools like Kibana, Icinga, and Prometheus/Grafana to maintain and improve system reliability. The position offers the opportunity to work on large-scale distributed systems while ensuring the highest levels of service availability and performance.

Oracle provides a comprehensive benefits package including medical, dental, and vision insurance, a 401k, and parental leave. The company promotes work-life balance and offers opportunities for professional growth within a global technology leader. This role is perfect for someone who values simplicity and scale, works well in collaborative environments, and is passionate about maintaining and improving critical infrastructure systems.

The ideal candidate should have a BS in Computer Science or related field, strong Linux systems knowledge, excellent troubleshooting skills, and the ability to work under pressure in time-critical situations. You'll be joining a company with over 40 years of industry leadership, working on systems that serve customers globally while contributing to the evolution of cloud computing technology.

Last updated 6 days ago

Responsibilities For Site Reliability Engineer

  • Ensure Oracle NetSuite NSGBU Cloud Operations systems are operational
  • Resolve site incidents on various levels of infrastructure
  • Work with monitoring and analytic tools
  • Participate in 24x7 follow-the-sun operational coverage
  • Design and implement improvements in service architecture
  • Facilitate service capacity planning and demand forecasting
  • Solve complex problems related to infrastructure cloud services
  • Build automation to prevent problem recurrence
  • Collaborate with multiple engineering teams

Requirements For Site Reliability Engineer

Linux · Python · Kubernetes · Redis · Cassandra · Kafka
  • BS in Computer Science or related field
  • 3-5+ years of experience
  • Linux systems internals knowledge
  • Understanding of web technologies, Apache, HTTPS/SSL
  • Knowledge of database environments
  • Excellent communication skills in English
  • Experience with monitoring and analytic tools (Kibana, Icinga, Prometheus/Grafana)
  • Understanding of distributed systems
  • Scripting knowledge in Bash, Perl, Python
  • Experience with orchestration tools (SaltStack, Terraform, Kubernetes, Ansible)

Benefits For Site Reliability Engineer

Medical Insurance · Vision Insurance · Dental Insurance · 401k · Parental Leave
  • Competitive benefits package
  • Medical, life insurance, and retirement options
  • Work-life balance
  • Volunteer programs
  • Equal Employment Opportunity
