Sr Site Reliability Engineer

Disney

The Walt Disney Company creates world-class entertainment experiences through theme parks, resorts, cruise ships, and media enterprises worldwide.

Celebration, FL 34747, USA

Site Reliability

Senior Software Engineer

In-Person

5,000+ Employees

5+ years of experience

Enterprise SaaS

Job Description

Disney Experiences Tech & Digital (DXT) creates world-class immersive digital experiences for Disney's premier vacation brands, including Parks & Resorts worldwide, Disney Cruise Line, Aulani, and Disney Vacation Club. This Senior Site Reliability Engineer role sits in the DSE Technology Operations organization and works closely with application teams across the company.

The position focuses on coordinating and managing retrospective discussions and troubleshooting operational systems. You'll work with infrastructure and application teams to determine root causes and provide recommendations for long-term fixes and interim mitigation steps, aiming to increase availability and reduce recovery time during system failures.

As an SRE, you'll be instrumental in designing and refreshing the lower-environment strategy to better support release and deployment activities. The DSE Technology Operations team provides critical operational support for production systems used by guests, cast, and crew for Disney Cruise Line, Disney Vacation Club, and all DSE emerging businesses.

Key responsibilities include driving DevOps culture, designing and building product platforms, implementing automation and monitoring solutions, and performing systems administration across Linux, Windows and Kubernetes environments. You'll work with various cloud platforms including AWS, Google Cloud, and Azure, while utilizing technologies like Git, AWX, and Ansible.

The ideal candidate brings strong systems administration skills, extensive experience with web technologies, and expertise in operational excellence, application stability, security, and capacity management. You'll collaborate with cross-functional teams to ensure comprehensive resolution of system issues while staying current with emerging technologies.

This role offers the opportunity to work with one of the world's most renowned entertainment companies, contributing to systems that power magical experiences for millions of guests. You'll be part of a team that values innovation, operational excellence, and continuous improvement, while working on cutting-edge technologies in a complex, large-scale environment.

The position requires a Bachelor's degree in Computer Science or related field, along with 5+ years of relevant experience. You'll need proficiency in cloud platforms, configuration management tools, and programming languages like Python, Go, or Java. Strong troubleshooting skills and experience with DevOps practices are essential for success in this role.

Responsibilities For Sr Site Reliability Engineer

Drive a DevOps culture among peers and developers
Design, build, and support product platforms
Perform systems administration in Windows, Linux, and Kubernetes platforms
Coordinate and organize retrospective discussions following major incidents
Design and implement robust monitoring solutions
Collaborate with cross-functional teams for system issue resolution
Apply SDLC, ITIL, and industry best practices
Provide expert-level support in troubleshooting
Manage the design, build, and upkeep of lower environments

Requirements For Sr Site Reliability Engineer

Python

Rust

Java

Kubernetes

Linux

Minimum 5 years of related work experience
Proficient in agile environments
Hands-on experience with CI tools like GitLab, Ansible, and Azure DevOps
Experience in procedural programming languages (Python, Perl, Ruby, Java, Go, Rust, C/C++)
Skilled in Cloud environments (AWS, Azure, Google Cloud)
Proficient in UNIX/Linux/Windows and Kubernetes administration
Strong troubleshooting skills across systems, network, and code
Bachelor's degree in Computer Science, Information Systems, Software Engineering or related field
