
AI Computing Development Engineer, TensorRT-LLM

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
Machine Learning
Mid-Level Software Engineer
In-Person
5,000+ Employees
2+ years of experience
AI

Description For AI Computing Development Engineer, TensorRT-LLM

NVIDIA, the world leader in accelerated computing, is seeking an AI Computing Development Engineer to join its TensorRT-LLM team. This role sits at the intersection of deep learning and high-performance computing, focusing on building inferencing software for NVIDIA's product lines. The position offers an opportunity to work on cutting-edge AI technologies, including LLMs, ChatGPT, and Generative AI.

The role involves developing and optimizing inferencing software that scales across multiple platforms, conducting performance analysis, and staying current with the latest developments in AI. You'll collaborate with software, research, and product teams across NVIDIA, contributing to the direction of machine learning inferencing technology. The position also offers the opportunity to publish research in scientific conferences.

NVIDIA is renowned as one of technology's most desirable employers, offering the chance to work with some of the industry's brightest minds. You'll be contributing to state-of-the-art AI and Compute systems, gaining exposure to the entire deep learning software stack. The company's work is transforming major industries and having a profound impact on society.

The ideal candidate will bring a strong technical foundation with a Master's or higher degree in a computing-related field, combined with practical experience in C/C++ or Python programming. You'll need to demonstrate expertise in deep learning frameworks and have a passion for staying current with AI developments. This role offers an exciting opportunity to shape the future of AI computing at a global technology leader.

Last updated 9 days ago

Responsibilities For AI Computing Development Engineer, TensorRT-LLM

  • Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance
  • Conduct performance analysis, optimization, and tuning
  • Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM with new features
  • Provide feedback on architecture and hardware design and development
  • Collaborate across the company to guide the direction of machine learning inferencing
  • Publish key results in scientific conferences

Requirements For AI Computing Development Engineer, TensorRT-LLM

  • Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field
  • 2+ years of relevant software development experience
  • Excellent C/C++ or Python programming and software design skills
  • Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning
  • Experience working with deep learning frameworks (PyTorch, TensorRT-LLM, NeMo, vLLM)
  • Proactive and able to work without supervision
  • Excellent written and oral communication skills in English
