System Development Engineer - Incident Management, IT Services

Global technology and e-commerce company that operates the world's largest online retail platform.
DevOps
Mid-Level Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Enterprise SaaS · E-Commerce

Description For System Development Engineer - Incident Management, IT Services

Amazon Consumer Tier One Support (C-TOS) is seeking a System Development Engineer to join its first line of defense for maintaining high availability of the Amazon Retail Website. This role is crucial in making customer-impacting events shorter, less frequent, and less severe through large-scale event and incident management. Working with a globally distributed team across Austin, Dublin, and Sydney, you'll be part of a 24x7 coverage model, working 10-hour shifts 4 days a week.

The position combines hands-on development of automation tools with incident management responsibilities. You'll build tooling to automate the detection and resolution of issues within Amazon's Retail Website infrastructure, while also leading conference calls and directing the resolution of high-visibility incidents. The role offers the opportunity to make a significant impact at scale, as the Amazon Retail Website serves hundreds of millions of customers globally.

As part of the team, you'll work on projects to expand tooling usage across Amazon, analyze incident data to drive process improvements, and contribute to making future events less severe or preventable entirely. The team is rapidly growing and expanding its offerings globally, making it an exciting time to join.

The ideal candidate should have a strong background in infrastructure automation, experience with modern programming languages, and familiarity with Linux/Unix environments. Knowledge of CI/CD pipelines and experience with distributed systems at scale are highly valued. This role offers excellent growth potential and the opportunity to make a substantial impact on the reliability of Amazon's global retail platform.

Working at Amazon, you'll be part of a company known for its innovation, customer-centric approach, and robust technical infrastructure. The role offers the chance to work with cutting-edge technologies while solving complex problems that affect millions of customers worldwide.

Responsibilities For System Development Engineer - Incident Management, IT Services

  • Drive the resolution of large-scale, customer-impacting issues as part of a globally rotating team
  • Design, build, and enhance incident detection and management tools
  • Participate in Agile sprints to evolve business processes and technologies
  • Create and review documentation; design new standard operating procedures
  • Identify and troubleshoot recurring platform issues and own projects to drive improvements
  • Mentor peers in technical and operational areas

Requirements For System Development Engineer - Incident Management, IT Services

  • Experience in automating, deploying, and supporting large-scale infrastructure
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
  • Experience with Linux/Unix
  • Experience with CI/CD pipelines and build processes

Benefits For System Development Engineer - Incident Management, IT Services

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 4-day work week with 10-hour shifts
