At AWS AI, we're building the next-generation AI platform to accelerate LLM and Generative AI development through Amazon SageMaker. This role focuses on developing distributed machine learning systems and large-scale solutions for our worldwide customer base. You'll work on the SageMaker HyperPod team, building the platform and products for large-scale deep learning model training (GPT models with 100+ billion parameters, running on thousands of GPU devices).
The position offers an opportunity to work with cutting-edge AI technology and shape the future of machine learning infrastructure. You'll collaborate with ML scientists and customers to influence overall strategy and define the team's roadmap. The role involves designing and implementing robust, scalable solutions while maintaining high engineering standards.
AWS provides a dynamic work environment with emphasis on work-life harmony, offering flexible hybrid work arrangements. The company strongly values diversity and inclusion, demonstrated through employee-led affinity groups and ongoing learning experiences. Career growth is supported through mentorship and knowledge-sharing opportunities.
Key technical aspects include:
The role combines technical leadership with hands-on development, requiring both strong engineering skills and the ability to mentor others. You'll be part of AWS's mission to democratize AI technology while working with some of the most advanced machine learning infrastructure in the industry.