Senior Software Engineer, LLM Inference

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.

Machine Learning

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Automotive

Description For Senior Software Engineer, LLM Inference

NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior Software Engineer specializing in LLM Inference at its Shanghai location. The role sits at the intersection of GPU computing and artificial intelligence and focuses on developing and optimizing inference software for large language models. The work targets AI-City and self-driving car applications and uses NVIDIA's GPU-accelerated libraries, including CUDA, cuDNN, and TensorRT.

The ideal candidate will join a team working on AI solutions that directly affect the future of autonomous vehicles and smart cities. The role requires technical expertise in C/C++ programming and deep learning frameworks, along with a strong understanding of the latest developments in AI, particularly LLMs and generative models.

NVIDIA has a rich history of innovation, from inventing the GPU in 1999 to revolutionizing parallel computing and igniting the modern AI era. The company's mission is to amplify human imagination and intelligence, which makes this role well suited to someone passionate about advancing AI technology and its real-world applications.

The position offers the chance to work with cross-functional teams, influence the direction of machine learning inferencing, and contribute to NVIDIA's continued leadership in AI and GPU computing. The successful candidate will be part of a company that has continuously reinvented itself and stays at the forefront of technological innovation.

Last updated 6 hours ago

Responsibilities For Senior Software Engineer, LLM Inference

Develop robust inferencing software scalable across multiple platforms
Conduct performance analysis, optimization, and tuning
Follow academic developments in artificial intelligence and update TensorRT to reflect them
Collaborate with software, research and product teams on machine learning inferencing direction

Requirements For Senior Software Engineer, LLM Inference

Python

Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field
3+ years of relevant software development experience
Excellent C/C++ programming and software design skills
Strong knowledge of artificial intelligence and deep learning
Experience with deep learning frameworks like PyTorch
Excellent written and oral communication skills in English
Strong customer communication skills
Ability to work independently

Jobs Related To NVIDIA Senior Software Engineer, LLM Inference

Senior Perception Engineer

NVIDIA

Senior Perception Engineer role at NVIDIA developing autonomous driving solutions using deep learning and computer vision, offering competitive salary and opportunity to work on cutting-edge technology.

Senior On-Device Model Inference Optimization Engineer

NVIDIA

Senior AI optimization role at NVIDIA focusing on improving performance and efficiency of AI models for autonomous vehicles, offering competitive salary and equity benefits.

Senior Prediction and Planning Machine Learning Engineer - Autonomous Vehicles

NVIDIA

Senior ML Engineer position at NVIDIA focusing on prediction and planning systems for autonomous vehicles.
