
Senior Deep Learning Software Engineer, Inference

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins that transform industries.
$148,000 - $287,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI

Job Description

NVIDIA is seeking a Senior Deep Learning Software Engineer specializing in inference to join its growing team. This role focuses on designing, building, and optimizing GPU-accelerated software that powers sophisticated AI applications. The position involves working with inference frameworks such as SGLang and vLLM, which are used for efficient large-scale model serving.

The role requires expertise in performance optimization of deep learning models, particularly in the LLM and generative AI domains. You'll work with technologies including CUTLASS, OpenAI Triton, NCCL, and CUDA kernels to implement and optimize model serving pipelines. The position offers an opportunity to contribute to public frameworks and drive performance improvements across NVIDIA's range of accelerators.

As part of NVIDIA, widely recognized as one of technology's most desirable employers, you'll join a team of forward-thinking professionals working on breakthrough technologies. The company offers highly competitive compensation, including a base salary range of $148,000 - $287,500 (depending on level), equity, and comprehensive benefits.

The ideal candidate has 5+ years of software development experience and strong C/C++ programming skills; experience with GPU programming and deep learning model optimization is preferred. A Master's or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience, is required. This is an excellent opportunity for someone passionate about AI and high-performance computing to make significant contributions to the field of deep learning inference.

Last updated a day ago

Responsibilities For Senior Deep Learning Software Engineer, Inference

  • Performance optimization, analysis, and tuning of DL models for LLM, multimodal, and generative AI workloads
  • Scale performance of DL models across different architectures and NVIDIA accelerators
  • Contribute features and code to NVIDIA's inference libraries, vLLM, SGLang, FlashInfer, and LLM software solutions
  • Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and inference optimization

Requirements For Senior Deep Learning Software Engineer, Inference

  • Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI)
  • 5+ years of relevant software development experience
  • Excellent C/C++ programming and software design skills
  • Python experience
  • Agile software development skills
  • GPU programming experience (CUDA, OpenAI Triton, or CUTLASS) is a plus
  • Prior experience with training, deploying or optimizing inference of DL models in production
  • Background in performance modeling, profiling, debugging, and code optimization

Benefits For Senior Deep Learning Software Engineer, Inference

  • Competitive base salary
  • Equity
  • Comprehensive benefits package

Related Jobs

Senior Architecture Energy Modeling Engineer

Senior Architecture Energy Modeling Engineer role at NVIDIA focusing on ML-based power modeling and energy efficiency optimization for GPUs, offering $168K-$310K base salary plus equity.

Senior DFX Software Engineer - Machine Learning

Senior DFX Software Engineer role at NVIDIA focusing on machine learning applications in silicon testing, offering $136K-$264.5K salary plus benefits.

Senior Software Engineer, Agentic AI

Senior Software Engineer position at NVIDIA focusing on developing the Agent Intelligence (AIQ) toolkit for enterprise AI applications, requiring 5+ years of Python experience and expertise in LLM frameworks.

Senior Deep Learning Frameworks Sustaining Engineer

Senior Deep Learning Engineer role at NVIDIA focusing on maintaining and improving machine learning frameworks and enterprise products.

Senior Computer Vision System Performance Engineer

Senior Computer Vision System Performance Engineer role at NVIDIA focusing on optimizing computer vision applications and developing hardware-accelerated pipelines.