AI Computing Development Engineer, TensorRT-LLM

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.

Shanghai, China

Machine Learning

Mid-Level Software Engineer

In-Person

5,000+ Employees

2+ years of experience

Description For AI Computing Development Engineer, TensorRT-LLM

NVIDIA, the world leader in accelerated computing, is seeking an AI Computing Development Engineer to join its TensorRT-LLM team. This role sits at the intersection of deep learning and high-performance computing, focusing on building inferencing software for NVIDIA's product lines. The position offers an opportunity to work on cutting-edge AI technologies, including LLMs, ChatGPT, and Generative AI.

The role involves developing and optimizing inferencing software that scales across multiple platforms, conducting performance analysis, and staying current with the latest developments in AI. You'll collaborate with software, research, and product teams across NVIDIA, contributing to the direction of machine learning inferencing technology. The position also offers the opportunity to publish research in scientific conferences.

NVIDIA is renowned as one of technology's most desirable employers, offering the chance to work with some of the industry's brightest minds. You'll be contributing to state-of-the-art AI and Compute systems, gaining exposure to the entire deep learning software stack. The company's work is transforming major industries and having a profound impact on society.

The ideal candidate will bring a strong technical foundation with a Master's or higher degree in a computing-related field, combined with practical experience in C/C++ or Python programming. You'll need to demonstrate expertise in deep learning frameworks and have a passion for staying current with AI developments. This role offers an exciting opportunity to shape the future of AI computing at a global technology leader.

Last updated 9 days ago

Responsibilities For AI Computing Development Engineer, TensorRT-LLM

Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance
Conduct performance analysis, optimization, and tuning
Closely follow academic developments in the field of artificial intelligence and update TensorRT-LLM with new features
Provide feedback on architecture and hardware design and development
Collaborate across the company to guide the direction of machine learning inferencing
Publish key results in scientific conferences

Requirements For AI Computing Development Engineer, TensorRT-LLM

Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field
2+ years of relevant software development experience
Excellent C/C++ or Python programming and software design skills
Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning
Experience working with deep learning frameworks such as PyTorch, TensorRT-LLM, NeMo, and vLLM
Proactive and able to work without supervision
Excellent written and oral communication skills in English
