NVIDIA is seeking a Senior Performance Software Engineer to join their Deep Learning Libraries team. This role focuses on developing optimized code to accelerate linear algebra and deep learning operations on NVIDIA GPUs. The position involves working on libraries such as cuDNN, cuBLAS, and TensorRT, which accelerate deep learning models. The successful candidate will be instrumental in enabling breakthroughs in image classification, speech recognition, and natural language processing.
The role requires expertise in writing highly optimized compute kernels using C++ CUDA, with a focus on core deep learning operations like matrix multiplies, convolutions, and normalizations. You'll be working closely with various teams across NVIDIA, including the CUDA compiler team, deep learning performance teams, and hardware architecture teams. This position is particularly exciting as it deals with code lower in the deep learning software stack, right down to the GPU hardware level.
NVIDIA offers a competitive compensation package with a base salary range of $184,000 - $425,500 USD, plus equity and benefits. The company is known for being one of the technology world's most desirable employers, offering opportunities to work on groundbreaking projects in AI and accelerated computing. They're committed to fostering a diverse work environment and are proud to be an equal opportunity employer.
The ideal candidate will have at least 6 years of relevant industry experience, strong C++ programming skills, and experience with parallel programming. Additional expertise in CUDA/OpenCL GPU programming, numerical methods, linear algebra, or tools like LLVM and TensorFlow MLIR would be particularly valuable. This is an opportunity to join a team that's building the fundamental software powering the AI revolution worldwide.