
Sr Software Development Engineer

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Sr Software Development Engineer

AWS AI is seeking exceptional software developers for its Deep Learning cross-framework team. This role focuses on developing extensions for the TensorFlow and PyTorch machine learning frameworks and creating cross-framework solutions for large-scale Deep Learning model training.

Key Responsibilities:

  • Developing innovative solutions for Large Language Model training in distributed clusters
  • Implementing model parallelism methods including pipeline and tensor parallelism for PyTorch
  • Optimizing distributed training through profiling and performance improvements
  • Implementing model state sharding and memory optimization techniques
  • Optimizing network communication for AWS infrastructure

The position is within the SageMaker Engines team, which develops technology for large-scale Deep Learning model training. The role combines deep technical expertise with innovative problem-solving in a fast-paced environment.

AWS offers comprehensive benefits including:

  • Competitive salary range: $151,300 - $261,500 per year
  • Equity compensation
  • Full medical, financial, and other benefits
  • Flexible work arrangements promoting work-life harmony
  • Career development and mentorship opportunities
  • Inclusive team culture with employee-led affinity groups

The ideal candidate will have strong expertise in C++ and Python programming, experience with TensorFlow/PyTorch frameworks, and a solid understanding of distributed systems and machine learning concepts. This is an opportunity to work with cutting-edge technology while contributing to AWS's position as a leader in cloud computing and AI infrastructure.

Last updated 2 months ago

Responsibilities For Sr Software Development Engineer

  • Developing innovative solutions for Large Language Model training in clusters
  • Implementing model parallelism methods for PyTorch framework
  • Implementing sharding of model training state and memory optimization
  • Optimizing distributed training performance
  • Optimizing communication collectives for AWS network infrastructure

Requirements For Sr Software Development Engineer

Python · Kubernetes
  • 4+ years of non-internship professional software development experience
  • 4+ years of programming experience with at least one programming language
  • 4+ years of experience leading design or architecture
  • Experience as a mentor or tech lead, or leading an engineering team
  • Bachelor's degree in computer science or equivalent

Benefits For Sr Software Development Engineer

Medical Insurance · Dental Insurance · Vision Insurance · 401k · Equity
  • Competitive salary range
  • Equity compensation
  • Medical, financial, and other benefits
  • Flexible work arrangements
  • Career development and mentorship
  • Inclusive team culture
