Deep Learning Engineer - Distributed Task-Based Backends

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins that transform industries.
$148,000 - $287,500
Machine Learning
Staff Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI

Description For Deep Learning Engineer - Distributed Task-Based Backends

NVIDIA, the world leader in accelerated computing, is seeking a Senior- to Principal-level Deep Learning Engineer to revolutionize the distributed backends of major frameworks such as PyTorch, JAX, and TensorFlow. The role combines cutting-edge AI development with high-performance computing, focusing on scaling AI models across thousands of GPUs.

The position offers the opportunity to work with premier Deep Learning frameworks and task-based runtime systems such as Legate, Legion, and Realm. You'll be at the forefront of developing compiler optimizations, parallelization strategies, and performance-debugging tools for large-scale AI models. The role suits someone who combines deep technical expertise in distributed systems with practical machine learning engineering experience.
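
For readers unfamiliar with what "task-based" means here, the sketch below is a minimal, hypothetical illustration of Legate-style code (it assumes the open-source cuNumeric package and the legate launcher, neither of which the posting names explicitly): the runtime partitions arrays and schedules work across GPUs, with no explicit communication code in the user program.

    # Minimal sketch, assuming cuNumeric (Legate's NumPy replacement) is installed
    # and the script is run under the legate driver, e.g. `legate --gpus 8 sketch.py`.
    import cunumeric as np  # drop-in NumPy API backed by the Legate/Legion runtime

    # Arrays are implicitly partitioned across the available GPUs by the runtime.
    a = np.ones((16384, 16384))
    b = np.ones((16384, 16384))

    # The matrix multiply is decomposed into Legion tasks and scheduled across
    # GPUs; no explicit MPI/NCCL calls appear in the user code.
    c = a @ b
    print(float(c.sum()))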

The ideal candidate will have 5+ years of experience, strong programming skills in Python and C++, and extensive knowledge of parallel and distributed programming, particularly with GPUs. You'll work directly with enterprise customers and collaborate across NVIDIA's teams to shape the future of distributed GPU computing.

This role offers competitive compensation ($148,000-$287,500 base salary) plus equity, and provides the flexibility of working remotely or from NVIDIA's Santa Clara office. You'll be part of a company that's transforming industries through AI and digital twins, working on challenges that directly impact the advancement of accelerated computing technology.

Join NVIDIA to help build the next generation of distributed AI systems, working with cutting-edge technology and some of the brightest minds in the industry. Your work will directly influence how AI models scale and perform across massive distributed systems, making a real impact on the future of AI computing.

Responsibilities For Deep Learning Engineer - Distributed Task-Based Backends

  • Develop extensions to popular Deep Learning frameworks that enable new parallelization strategies
  • Develop compiler optimizations and parallelization heuristics
  • Develop tools for performance debugging of AI models at large scales
  • Study and tune Deep Learning training workloads at large scale
  • Support enterprise customers and partners to scale novel models
  • Collaborate with Deep Learning software and hardware teams
  • Contribute to runtime systems development for distributed GPU computing

Requirements For Deep Learning Engineer - Distributed Task-Based Backends

  • BS, MS, or PhD degree in Computer Science, Electrical Engineering, or a related field
  • 5+ years of relevant industry experience, or equivalent academic experience after a BS
  • Proficiency in Python and C++ programming
  • Strong background with parallel and distributed programming, preferably on GPUs
  • Hands-on development skills using Machine Learning frameworks
  • Understanding of Deep Learning training in distributed contexts

Benefits For Deep Learning Engineer - Distributed Task-Based Backends

  • Equity compensation
  • Comprehensive benefits package

Jobs Related To NVIDIA Deep Learning Engineer - Distributed Task-Based Backends

Director, AI Software

Lead AI software development and team building for NVIDIA's Metropolis manufacturing platform, driving innovation in computer vision and data analytics.

Senior Research Engineer, Foundation Model Training Infrastructure

Senior Research Engineer role at NVIDIA focusing on building infrastructure for large-scale foundation model training in robotics, offering competitive compensation and the opportunity to work on cutting-edge AI technology.

Senior Deep Learning Performance Architect

Senior Deep Learning Performance Architect role at NVIDIA focusing on developing advanced processor architectures for AI acceleration, offering competitive compensation and the chance to shape the future of machine learning.

Senior Deep Learning Performance Architect

Senior Deep Learning Performance Architect position at NVIDIA, developing next-generation AI architectures with competitive compensation and opportunity to work on cutting-edge technology.

Sr. Machine Learning - Compiler Engineer III, AWS Neuron, Annapurna Labs

Senior Machine Learning Compiler Engineer position at AWS Neuron team, focusing on optimizing ML models for AWS Inferentia and Trainium custom chips.