NVIDIA is seeking a Senior Deep Learning Software Engineer specializing in inference to join its growing team. This role focuses on designing, building, and optimizing GPU-accelerated software that powers sophisticated AI applications. The position involves working with open-source frameworks and tools such as CUTLASS, OpenAI Triton, NCCL, and CUDA kernels to implement and optimize model serving pipelines.
The ideal candidate will have strong expertise in C/C++ programming, deep learning model optimization, and GPU programming. They will work closely with the deep learning community to implement cutting-edge algorithms for public release in inference frameworks. The role involves identifying and driving performance improvements for state-of-the-art LLM and Generative AI models across NVIDIA's range of accelerators.
NVIDIA offers competitive compensation, with a base salary ranging from $148,000 to $287,500 depending on level and experience, plus equity and comprehensive benefits. The company is known as one of the technology world's most desirable employers, with forward-thinking teams and outstanding growth opportunities.
The position is based in Santa Clara, CA, and requires at least 5 years of relevant software development experience. Key responsibilities include performance optimization of deep learning models, scaling solutions across different architectures, and contributing to NVIDIA's inference libraries. The role offers the opportunity to work at the forefront of AI technology, implementing solutions that power the next generation of AI applications.
NVIDIA values diversity and maintains an inclusive work environment, providing equal opportunities to all qualified candidates. The company's work in AI and digital twins is transforming major industries and making a significant societal impact. This role presents an excellent opportunity for those passionate about deep learning and high-performance computing to contribute to groundbreaking developments in AI technology.