Senior Performance Software Engineer, Deep Learning Libraries

NVIDIA is the world leader in accelerated computing, pioneering solutions for AI and digital twins.
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
2+ years of experience
AI

Description For Senior Performance Software Engineer, Deep Learning Libraries

NVIDIA is seeking a Senior Performance Software Engineer to join its Deep Learning Libraries team. This role focuses on developing optimized code to accelerate linear algebra and deep learning operations on NVIDIA GPUs. The position involves working on libraries such as cuDNN, cuBLAS, and TensorRT to accelerate deep learning models. The successful candidate will be instrumental in enabling breakthroughs in image classification, speech recognition, and natural language processing.

The role requires expertise in writing highly optimized compute kernels using C++ CUDA, with a focus on core deep learning operations such as matrix multiplies, convolutions, and normalizations. You'll be working at the lower levels of the deep learning software stack, directly interfacing with GPU hardware. Collaboration is key, as you'll work across multiple NVIDIA teams including the CUDA compiler team, deep learning performance teams, and hardware architecture teams.

This is an excellent opportunity for someone with strong parallel programming experience and a deep understanding of computer architecture. The ideal candidate should have at least 2 years of industry experience and a Master's or PhD degree, or equivalent experience, in Computer Science or a related field. Experience with CUDA/OpenCL GPU programming, numerical methods, and linear algebra would be particularly valuable.

Join NVIDIA and be part of a team that's driving the revolution in artificial intelligence. You'll be working on projects that directly impact the performance of AI applications worldwide and contributing to open-source projects like CUTLASS.

Last updated a day ago

Responsibilities For Senior Performance Software Engineer, Deep Learning Libraries

  • Writing highly tuned compute kernels in C++ CUDA for core deep learning operations
  • Following software engineering best practices including regression testing and CI/CD flows
  • Collaborating with the CUDA compiler team to generate optimal assembly code
  • Working with deep learning training and inference performance teams
  • Collaborating with hardware and architecture teams on programming models

Requirements For Senior Performance Software Engineer, Deep Learning Libraries

  • Master's or PhD degree, or equivalent experience, in Computer Science, Computer Engineering, Applied Math, or a related field
  • 2+ years of relevant industry experience
  • Strong C++ programming and software design skills
  • Experience with performance-oriented parallel programming
  • Solid understanding of computer architecture and assembly programming
