Site Reliability Engineer, ESC Managed Operations

Amazon

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.

Dublin, Ireland

Site Reliability

Mid-Level Software Engineer

In-Person

5,000+ Employees

3+ years of experience

Enterprise SaaS · Cloud

Description For Site Reliability Engineer, ESC Managed Operations

AWS is launching its first European Sovereign Cloud (ESC), a groundbreaking development in utility computing. This role sits on the AWS Managed Operations team, which builds and leads operations for high-availability AWS services such as EC2, S3, DynamoDB, Lambda, and Bedrock, specifically for EU customers. The position involves working with global AWS teams to influence how services evolve and to ensure optimal performance.

As a Site Reliability Engineer, you'll be instrumental in the ESC launch planned for 2025. Your daily responsibilities will include collaborating with technology leaders, enhancing operations, and improving system availability, reliability, and performance. The role requires participation in on-call rotations for incident management.

AWS Utility Computing (UC) is at the forefront of product innovation, from foundational services like S3 and EC2 to cutting-edge features that maintain AWS's industry leadership. The Managed Operations team specifically works with customers requiring specialized security solutions for cloud services.

The ideal candidate brings 3+ years of software development experience with proficiency in modern programming languages and strong Linux/networking fundamentals. You'll need excellent troubleshooting abilities across all system levels and experience with cloud systems operations.

Amazon offers a supportive work environment with emphasis on work-life harmony, diverse experiences, and continuous learning. The company provides comprehensive benefits including relocation support for EU candidates, mentorship opportunities, and involvement in employee-led affinity groups. This role offers the chance to shape the future of cloud computing while working with cutting-edge technologies in a collaborative, innovative environment.

Last updated 2 days ago

Responsibilities For Site Reliability Engineer, ESC Managed Operations

Play an instrumental role in the launch of the European Sovereign Cloud (ESC) in 2025
Work with global AWS teams and influence AWS services evolution
Enhance day-to-day operations and improve availability, reliability, latency, and performance
Participate in on-call rotations for incident resolution
Collaborate with technology leaders
Ensure high-availability experience for EU customers

Requirements For Site Reliability Engineer, ESC Managed Operations

Linux · Python · Java · TypeScript · Ruby

3+ years of experience in software development with proficiency in Java, TypeScript, Python, or Ruby
3+ years of experience with Linux, command line, and computer networking fundamentals
Ability to troubleshoot at all levels from network to operating systems to software applications
Experience supporting cloud systems or other services
Fluency in written and spoken English
Legal right to work in Ireland

Benefits For Site Reliability Engineer, ESC Managed Operations

Relocation Benefits

Relocation support for EU candidates
Work-life harmony focus
Mentorship and career growth opportunities
Employee-led affinity groups
Ongoing learning experiences
