Deep Learning Engineer - Distributed Task-Based Backends

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins that transform industries.
$148,000 - $287,500
Machine Learning
Staff Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI

Description For Deep Learning Engineer - Distributed Task-Based Backends

NVIDIA, the world leader in accelerated computing, is seeking a Senior- to Principal-level Deep Learning Engineer to revolutionize the distributed backends of major frameworks such as PyTorch, JAX, and TensorFlow. The role combines cutting-edge AI development with high-performance computing, focusing on scaling AI models across thousands of GPUs.

The position offers the opportunity to work with premier Deep Learning frameworks and task-based runtime systems such as Legate, Legion, and Realm. You'll be at the forefront of developing compiler optimizations, parallelization strategies, and performance-debugging tools for large-scale AI models. The role suits someone who combines deep technical expertise in distributed systems with practical machine learning engineering experience.
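
For readers unfamiliar with what "task-based" means here, the sketch below is a minimal, hypothetical illustration of Legate-style code (it assumes the open-source cuNumeric package and the legate launcher, neither of which the posting names explicitly): the runtime partitions arrays and schedules work across GPUs, with no explicit communication code in the user program.

    # Minimal sketch, assuming cuNumeric (Legate's NumPy replacement) is installed
    # and the script is run under the legate driver, e.g. `legate --gpus 8 sketch.py`.
    import cunumeric as np  # drop-in NumPy API backed by the Legate/Legion runtime

    # Arrays are implicitly partitioned across the available GPUs by the runtime.
    a = np.ones((16384, 16384))
    b = np.ones((16384, 16384))

    # The matrix multiply is decomposed into Legion tasks and scheduled across
    # GPUs; no explicit MPI/NCCL calls appear in the user code.
    c = a @ b
    print(float(c.sum()))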

The ideal candidate will have 5+ years of experience, strong programming skills in Python and C++, and extensive knowledge of parallel and distributed programming, particularly with GPUs. You'll work directly with enterprise customers and collaborate across NVIDIA's teams to shape the future of distributed GPU computing.

This role offers competitive compensation ($148,000-$287,500 base salary) plus equity, and provides the flexibility of working remotely or from NVIDIA's Santa Clara office. You'll be part of a company that's transforming industries through AI and digital twins, working on challenges that directly impact the advancement of accelerated computing technology.

Join NVIDIA to help build the next generation of distributed AI systems, working with cutting-edge technology and some of the brightest minds in the industry. Your work will directly influence how AI models scale and perform across massive distributed systems, making a real impact on the future of AI computing.

Responsibilities For Deep Learning Engineer - Distributed Task-Based Backends

  • Develop extensions to popular Deep Learning frameworks that enable new parallelization strategies
  • Develop compiler optimizations and parallelization heuristics
  • Develop tools for performance debugging of AI models at large scales
  • Study and tune Deep Learning training workloads at large scale
  • Support enterprise customers and partners to scale novel models
  • Collaborate with Deep Learning software and hardware teams
  • Contribute to runtime systems development for distributed GPU computing

Requirements For Deep Learning Engineer - Distributed Task-Based Backends

  • BS, MS, or PhD degree in Computer Science, Electrical Engineering, or a related field
  • 5+ years of relevant industry experience, or equivalent academic experience after a BS
  • Proficiency in Python and C++ programming
  • Strong background with parallel and distributed programming, preferably on GPUs
  • Hands-on development skills using Machine Learning frameworks
  • Understanding of Deep Learning training in distributed contexts

Benefits For Deep Learning Engineer - Distributed Task-Based Backends

  • Equity compensation
  • Comprehensive benefits package

Jobs Related To NVIDIA Deep Learning Engineer - Distributed Task-Based Backends

Director, AI Software

Lead AI software development and team building for NVIDIA's Metropolis manufacturing platform, driving innovation in computer vision and data analytics.

Senior Research Engineer, Foundation Model Training Infrastructure

Senior Research Engineer role at NVIDIA focusing on building infrastructure for large-scale foundation model training in robotics, offering competitive compensation and the opportunity to work on cutting-edge AI technology.

Senior Deep Learning Performance Architect

Senior Deep Learning Performance Architect role at NVIDIA focusing on developing advanced processor architectures for AI acceleration, offering competitive compensation and the chance to shape the future of machine learning.

Senior Deep Learning Performance Architect

Senior Deep Learning Performance Architect position at NVIDIA, developing next-generation AI architectures with competitive compensation and opportunity to work on cutting-edge technology.

Sr. Machine Learning - Compiler Engineer III, AWS Neuron, Annapurna Labs

Senior Machine Learning Compiler Engineer position at AWS Neuron team, focusing on optimizing ML models for AWS Inferentia and Trainium custom chips.