Sr Software Development Engineer

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS · Cloud

Description For Sr Software Development Engineer

AWS AI is seeking exceptional software developers to join its Deep Learning cross-framework team. This role focuses on developing extensions for the TensorFlow and PyTorch machine learning frameworks and creating cross-framework solutions for large-scale Deep Learning model training.

Key responsibilities include:

  • Developing innovative solutions for Large Language Model training across node clusters
  • Implementing model parallelism methods including pipeline and tensor parallelism
  • Creating memory optimization techniques like activation checkpointing/offloading
  • Optimizing distributed training through performance analysis and bottleneck resolution
  • Enhancing communication collectives for AWS network infrastructure

The position is within the SageMaker Engines team, which specializes in developing technology for large-scale Deep Learning model training. The role combines cutting-edge machine learning infrastructure development with high-performance computing optimization.

AWS offers comprehensive benefits including:

  • Competitive compensation with base pay ranging from $151,300 to $261,500 annually
  • Equity compensation and sign-on payments
  • Complete medical, financial, and additional benefits
  • Strong focus on work-life harmony and flexible working culture
  • Extensive career development opportunities and mentorship
  • Inclusive team culture with employee-led affinity groups

The role provides an opportunity to work with industry leaders in a fast-paced, innovative environment while contributing to AWS's position as a thought leader in cloud computing and artificial intelligence.

Responsibilities For Sr Software Development Engineer

  • Developing innovative solutions for Large Language Model training in a cluster of nodes
  • Implementing model parallelism methods such as pipeline and tensor parallelism
  • Implementing sharding of model training state and memory saving techniques
  • Optimizing distributed training through performance analysis
  • Optimizing communication collectives for AWS network infrastructure

Requirements For Sr Software Development Engineer

  • 4+ years of non-internship professional software development experience
  • 4+ years of programming experience with at least one programming language
  • 4+ years of experience leading design or architecture
  • Experience as a mentor, tech lead or leading an engineering team
  • Bachelor's degree in computer science or equivalent
  • Strong working knowledge of C++ and Python programming languages
  • Experience with CUDA programming
  • Experience with Linux kernel system calls or POSIX API

Benefits For Sr Software Development Engineer

  • Medical Insurance
  • Equity
  • 401k
  • Parental Leave
  • Education Budget
