Sr Software Development Engineer

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS · Cloud
This job posting may no longer be active.

Description For Sr Software Development Engineer

AWS AI is seeking exceptional software developers to join its Deep Learning cross-framework team. This role focuses on developing extensions for the TensorFlow and PyTorch machine learning frameworks and creating cross-framework solutions for large-scale Deep Learning model training.

Key responsibilities include:

  • Developing innovative solutions for Large Language Model training across node clusters
  • Implementing model parallelism methods including pipeline and tensor parallelism
  • Creating memory optimization techniques like activation checkpointing/offloading
  • Optimizing distributed training through performance analysis and bottleneck resolution
  • Enhancing communication collectives for AWS network infrastructure

The position is within the SageMaker Engines team, which specializes in developing technology for large-scale Deep Learning model training. The role combines cutting-edge machine learning infrastructure development with high-performance computing optimization.

AWS offers comprehensive benefits including:

  • Competitive compensation with base pay ranging from $151,300 to $261,500 annually
  • Equity compensation and sign-on payments
  • Complete medical, financial, and additional benefits
  • Strong focus on work-life harmony and flexible working culture
  • Extensive career development opportunities and mentorship
  • Inclusive team culture with employee-led affinity groups

The role provides an opportunity to work with industry leaders in a fast-paced, innovative environment while contributing to AWS's position as a thought leader in cloud computing and artificial intelligence.

Last updated a month ago

Responsibilities For Sr Software Development Engineer

  • Developing innovative solutions for Large Language Model training in a cluster of nodes
  • Implementing model parallelism methods such as pipeline and tensor parallelism
  • Implementing sharding of model training state and memory saving techniques
  • Optimizing distributed training through performance analysis
  • Optimizing communication collectives for AWS network infrastructure

Requirements For Sr Software Development Engineer

  • 4+ years of non-internship professional software development experience
  • 4+ years of programming experience in at least one programming language
  • 4+ years of experience leading design or architecture
  • Experience as a mentor, tech lead or leading an engineering team
  • Bachelor's degree in computer science or equivalent
  • Strong working knowledge of C++ and Python programming languages
  • Experience with CUDA programming
  • Experience with Linux kernel system calls or POSIX API

Benefits For Sr Software Development Engineer

  • Medical Insurance
  • Equity
  • 401k
  • Parental Leave
  • Education Budget