Sr Site Reliability Engineer

The Walt Disney Company creates world-class entertainment experiences through theme parks, resorts, cruise ships, and media enterprises worldwide.
Celebration, FL 34747, USA
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Job Description

Disney Experiences Tech & Digital (DXT) creates world-class immersive digital experiences for Disney's premier vacation brands including Parks & Resorts worldwide, Disney Cruise Line, Aulani, and Disney Vacation Club. This Senior Site Reliability Engineer role sits in the DSE Technology Operations organization, working closely with application teams across the company.

The position focuses on coordinating and managing retrospective discussions and troubleshooting operational systems. You'll work with infrastructure and application teams to determine root causes and provide recommendations for long-term fixes and interim mitigation steps, aiming to increase availability and reduce recovery time during system failures.

As an SRE, you'll be instrumental in designing and refreshing the lower-environment strategy to better support release and deployment activities. The DSE Technology Operations team provides critical operational support for production systems used by guests, cast, and crew for Disney Cruise Line, Disney Vacation Club, and all DSE emerging businesses.

Key responsibilities include driving DevOps culture, designing and building product platforms, implementing automation and monitoring solutions, and performing systems administration across Linux, Windows, and Kubernetes environments. You'll work with cloud platforms including AWS, Google Cloud, and Azure, and use technologies like Git, AWX, and Ansible.

The ideal candidate brings strong systems administration skills, extensive experience with web technologies, and expertise in operational excellence, application stability, security, and capacity management. You'll collaborate with cross-functional teams to ensure comprehensive resolution of system issues while staying current with emerging technologies.

This role offers the opportunity to work with one of the world's most renowned entertainment companies, contributing to systems that power magical experiences for millions of guests. You'll be part of a team that values innovation, operational excellence, and continuous improvement, while working on cutting-edge technologies in a complex, large-scale environment.

The position requires a Bachelor's degree in Computer Science or related field, along with 5+ years of relevant experience. You'll need proficiency in cloud platforms, configuration management tools, and programming languages like Python, Go, or Java. Strong troubleshooting skills and experience with DevOps practices are essential for success in this role.

Last updated a day ago

Responsibilities For Sr Site Reliability Engineer

  • Drive a DevOps culture among peers and developers
  • Design, build, and support product platforms
  • Perform systems administration in Windows, Linux, and Kubernetes platforms
  • Coordinate and organize retrospective discussions following major incidents
  • Design and implement robust monitoring solutions
  • Collaborate with cross-functional teams for system issue resolution
  • Apply SDLC, ITIL, and industry best practices
  • Provide expert-level support in troubleshooting
  • Own the design, build, and management of lower environments

Requirements For Sr Site Reliability Engineer

Python, Go, Rust, Java, Kubernetes, Linux
  • Minimum 5 years of related work experience
  • Proficient in agile environments
  • Hands-on experience with CI and automation tools like GitLab, Ansible, and Azure DevOps
  • Experience in procedural programming languages (Python, Perl, Ruby, Java, Go, Rust, C/C++)
  • Skilled in Cloud environments (AWS, Azure, Google Cloud)
  • Proficient in UNIX/Linux/Windows and Kubernetes administration
  • Strong troubleshooting skills across systems, network, and code
  • Bachelor's degree in Computer Science, Information Systems, Software Engineering or related field
