Sr Software Development Engineer

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS

Description For Sr Software Development Engineer

AWS AI is seeking exceptional software developers to join its Deep Learning cross-framework team. The role sits within AWS's cloud computing division and focuses on building infrastructure for machine learning and AI.

The position involves working with TensorFlow and PyTorch frameworks, developing solutions for training Deep Learning models at massive scale. You'll be part of the SageMaker Engines team, working alongside leading engineers and researchers in the field. The role combines deep technical expertise in machine learning frameworks with distributed systems engineering.

Key responsibilities include implementing sophisticated model parallelism methods, optimizing distributed training performance, and developing memory-efficient training techniques. You'll work on systems that handle thousands of accelerators and contribute to AWS's position as a thought leader in cloud-based AI infrastructure.

The ideal candidate brings strong software development experience, particularly in C++ and Python, with a deep understanding of machine learning frameworks and distributed systems. You'll need to demonstrate leadership experience and the ability to mentor others, as well as strong architectural design skills.

AWS offers a comprehensive benefits package, including competitive base pay ranging from $151,300 to $261,500 depending on location, plus equity and other benefits. The company strongly values diversity and inclusion, providing numerous employee-led affinity groups and ongoing learning opportunities.

This role offers unique opportunities to work on cutting-edge AI infrastructure, with access to AWS's vast resources and the chance to impact how the world's leading companies train their AI models. You'll be part of a culture that emphasizes continuous learning, work-life harmony, and professional growth.

The position is based in the San Francisco Bay Area and requires hands-on experience with machine learning frameworks, distributed systems, and high-performance computing. Join AWS to help shape the future of cloud-based AI infrastructure while working with some of the industry's most advanced technologies and talented professionals.


Responsibilities For Sr Software Development Engineer

  • Developing solutions to support Large Language Model training across a cluster of nodes
  • Implementing model parallelism methods such as pipeline and tensor parallelism as extensions to the PyTorch framework
  • Implementing sharding of the model training state, activation checkpointing and offloading, and other memory-saving techniques
  • Optimizing distributed training by profiling, identifying bottlenecks, and addressing them
  • Optimizing communication collectives for the AWS network infrastructure

Requirements For Sr Software Development Engineer

  • 4+ years of non-internship professional software development experience
  • 4+ years of programming experience in at least one programming language
  • 4+ years of leading design or architecture of new and existing systems
  • Experience as a mentor, tech lead or leading an engineering team
  • Bachelor's degree in computer science or equivalent
  • Strong working knowledge of the C++ programming language
  • Strong working knowledge of the Python programming language

Benefits For Sr Software Development Engineer

  • Medical insurance, financial, and other benefits
  • Mental health assistance
  • Equity compensation
  • Mentorship and career growth opportunities
  • Work-life harmony
  • Inclusive team culture
