Amazon's Machine Learning Training Infrastructure (ML Infra) team is seeking a Senior Machine Learning Engineer to spearhead the development of next-generation ML training infrastructure. This high-impact role involves designing and implementing the large-scale computing infrastructure that powers cutting-edge AI initiatives across Amazon.
The position sits within the Amazon General Intelligence (AGI) team, working at the forefront of artificial intelligence. You'll be responsible for developing hyper-scalable, general-purpose large model training and inference systems that change how machines perceive, interpret, and interact with humans and the physical world.
As a senior engineer, you'll lead the most advanced and challenging projects spanning ML infrastructure, working with state-of-the-art AI frameworks, hardware accelerators, and distributed computing systems. You'll collaborate with top minds in deep learning and reinforcement learning while mentoring other engineers and establishing best practices.
The role offers competitive compensation ranging from $151,300 to $261,500 depending on location, plus equity and comprehensive benefits. You'll have the opportunity to shape the future of AI technology while working with Amazon's vast resources and talented teams.
Key technical areas include distributed systems, parallel computing, AI frameworks (PyTorch, TensorFlow, JAX), containerization, and cloud computing. You'll need strong expertise in Python, C++, or Rust, along with deep knowledge of ML infrastructure and hardware architectures.
This is an exceptional opportunity to drive innovation in AI infrastructure at scale. Join Amazon's AGI team to work on breakthrough research and product development that push the boundaries of what's possible in artificial intelligence.