Join Amazon's Infrastructure Reliability Engineering team as a Senior Software Development Engineer and help reduce MTTR and improve service availability across Fulfillment Technology and Robotics. This role focuses on innovation in global network device monitoring and telemetry collection, and on pioneering new relational monitoring tools. You'll use AI to build automatic detection and remediation solutions that reduce human intervention during high-severity events.
The position offers extensive scope to work across organizations and influence the technical direction of multiple teams. You'll architect and develop systems that monitor and maintain service health at scale, spanning thousands of global sites. The role combines cutting-edge technology with practical problem-solving: you'll work with diverse telemetry sources, including software applications, AWS services, network paths, and device fleets.
As part of Amazon's Infrastructure Reliability Engineering team, a global organization with a presence in both the USA and Europe, you'll collaborate with talented engineers worldwide. The team is dedicated to building tools that improve the availability of network and service infrastructure across Amazon's global fulfillment network.
This role offers comprehensive benefits, including medical, dental, and vision coverage, parental leave options, PTO, and a 401(k) plan. It's an excellent opportunity for experienced engineers who want to make a lasting impact on global infrastructure reliability while working with cutting-edge technology and leading teams.