xAI is seeking an AI Engineer & Researcher specializing in Inference to join its mission of creating AI systems that understand the universe. The role combines cutting-edge AI research with hands-on engineering, focusing on optimizing model inference and building scalable production systems.
The position is based in the Bay Area (San Francisco and Palo Alto) and offers a salary range of $180,000 to $440,000 USD. The team operates with a flat organizational structure where leadership is earned through initiative and excellence.
The role involves working with modern AI technologies including Python, Rust, PyTorch, JAX, and CUDA. A key responsibility is leading the development of SGLang, one of the most popular open-source inference engines. The successful candidate will focus on optimizing model inference performance, building reliable production systems that serve millions of users, and advancing research in scaling test-time compute.
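As a rough illustration of what day-to-day work with SGLang can look like, the sketch below uses its Python frontend to express a small multi-turn generation program. Treat it as a hedged example rather than canonical usage: it assumes a compatible SGLang release and a server already launched locally, and the model, port, and questions are placeholders.

```python
import sglang as sgl

# A small SGLang frontend program: two chat turns that share a prefix,
# which the runtime can serve from its KV cache instead of recomputing.
@sgl.function
def multi_turn(s, question_1, question_2):
    s += sgl.system("You are a concise assistant.")
    s += sgl.user(question_1)
    s += sgl.assistant(sgl.gen("answer_1", max_tokens=128))
    s += sgl.user(question_2)
    s += sgl.assistant(sgl.gen("answer_2", max_tokens=128))

# Point the frontend at a locally running SGLang server (assumed port).
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = multi_turn.run(
    question_1="What is speculative decoding?",
    question_2="How does it interact with batching?",
)
print(state["answer_1"])
print(state["answer_2"])
```

The same kind of program can also be run over many inputs at once via the frontend's batch execution, which is where the serving-side batching and caching work described below pays off.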
The ideal candidate should have extensive experience in system optimizations for model serving, including batching, caching, and load balancing. Knowledge of low-level optimizations for inference, such as GPU kernels and code generation, is crucial. Experience with algorithmic optimizations like quantization and speculative decoding is also required.
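As a concrete illustration of the last of those techniques, the toy below sketches the draft-then-verify loop behind speculative decoding. The two "models" are arbitrary stand-in functions invented for this example, not xAI's or SGLang's implementation, and a production engine would verify all draft positions with a single batched forward pass of the target model.

```python
import random

# Toy stand-ins for a small draft model and a large target model.
# Each maps a token prefix to a next-token id; a real system would run
# neural networks here, with the target returning full distributions.
def draft_model(prefix):
    return (sum(prefix) * 31 + 7) % 50          # fast, approximate

def target_model(prefix):
    guess = (sum(prefix) * 31 + 7) % 50         # slow, authoritative
    return guess if random.random() < 0.8 else random.randrange(50)

def speculative_decode(prompt, num_tokens, k=4):
    """Greedy speculative decoding: draft k tokens cheaply, verify them
    with the target model, keep the longest agreeing prefix, and replace
    the first mismatch with the target's own token."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        # 1) Draft k candidate tokens autoregressively with the cheap model.
        draft = []
        for _ in range(k):
            draft.append(draft_model(tokens + draft))
        # 2) Verify drafts against the target model. A real engine scores
        #    all k positions in one batched forward pass, which is where
        #    the speedup comes from.
        for i in range(k):
            expected = target_model(tokens + draft[:i])
            if expected == draft[i]:
                tokens.append(draft[i])         # accepted
            else:
                tokens.append(expected)         # correct the mismatch, stop this round
                break
    return tokens[len(prompt):len(prompt) + num_tokens]

print(speculative_decode([1, 2, 3], num_tokens=16))
```

The same skeleton extends to the probabilistic accept/reject rule used in practice, where acceptance is decided from the target model's distribution rather than exact token equality.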
The team values strong communication skills and the ability to share knowledge effectively. The interview process is thorough but efficient, typically completed within a week, and consists of a coding assessment, a hands-on systems evaluation, a project presentation, and a team meet-and-greet.
This is an excellent opportunity for a senior-level AI engineer who wants to work on challenging problems at the intersection of theoretical AI research and practical implementation, contributing to systems that will serve millions of users while advancing the field of AI.