xAI is seeking an AI Engineer & Researcher specializing in Inference to join its mission of creating AI systems that understand the universe. The role combines cutting-edge AI research with hands-on engineering, focusing on optimizing model inference and building scalable production systems.
The position is based in the Bay Area (San Francisco and Palo Alto) and offers a salary range of $180,000 to $440,000 USD. The team operates with a flat organizational structure where leadership is earned through initiative and excellence.
The role involves working with modern AI technologies including Python, Rust, PyTorch, JAX, and CUDA. A key responsibility is leading the development of SGLang, one of the most popular open-source inference engines. The successful candidate will focus on optimizing model inference performance, building reliable production systems that serve millions of users, and advancing research in scaling test-time compute.
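As a rough illustration of what day-to-day work with SGLang can look like, the sketch below uses its Python frontend to express a small multi-turn generation program. Treat it as a hedged example rather than canonical usage: it assumes a compatible SGLang release and a server already launched locally, and the model, port, and questions are placeholders.

```python
import sglang as sgl

# A small SGLang frontend program: two chat turns that share a prefix,
# which the runtime can serve from its KV cache instead of recomputing.
@sgl.function
def multi_turn(s, question_1, question_2):
    s += sgl.system("You are a concise assistant.")
    s += sgl.user(question_1)
    s += sgl.assistant(sgl.gen("answer_1", max_tokens=128))
    s += sgl.user(question_2)
    s += sgl.assistant(sgl.gen("answer_2", max_tokens=128))

# Point the frontend at a locally running SGLang server (assumed port).
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = multi_turn.run(
    question_1="What is speculative decoding?",
    question_2="How does it interact with batching?",
)
print(state["answer_1"])
print(state["answer_2"])
```

The same kind of program can also be run over many inputs at once via the frontend's batch execution, which is where the serving-side batching and caching work described below pays off.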
The ideal candidate should have extensive experience in system optimizations for model serving, including batching, caching, and load balancing. Knowledge of low-level optimizations for inference, such as GPU kernels and code generation, is crucial. Experience with algorithmic optimizations like quantization and speculative decoding is also required.
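As a concrete illustration of the last of those techniques, the toy below sketches the draft-then-verify loop behind speculative decoding. The two "models" are arbitrary stand-in functions invented for this example, not xAI's or SGLang's implementation, and a production engine would verify all draft positions with a single batched forward pass of the target model.

```python
import random

# Toy stand-ins for a small draft model and a large target model.
# Each maps a token prefix to a next-token id; a real system would run
# neural networks here, with the target returning full distributions.
def draft_model(prefix):
    return (sum(prefix) * 31 + 7) % 50          # fast, approximate

def target_model(prefix):
    guess = (sum(prefix) * 31 + 7) % 50         # slow, authoritative
    return guess if random.random() < 0.8 else random.randrange(50)

def speculative_decode(prompt, num_tokens, k=4):
    """Greedy speculative decoding: draft k tokens cheaply, verify them
    with the target model, keep the longest agreeing prefix, and replace
    the first mismatch with the target's own token."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        # 1) Draft k candidate tokens autoregressively with the cheap model.
        draft = []
        for _ in range(k):
            draft.append(draft_model(tokens + draft))
        # 2) Verify drafts against the target model. A real engine scores
        #    all k positions in one batched forward pass, which is where
        #    the speedup comes from.
        for i in range(k):
            expected = target_model(tokens + draft[:i])
            if expected == draft[i]:
                tokens.append(draft[i])         # accepted
            else:
                tokens.append(expected)         # correct the mismatch, stop this round
                break
    return tokens[len(prompt):len(prompt) + num_tokens]

print(speculative_decode([1, 2, 3], num_tokens=16))
```

The same skeleton extends to the probabilistic accept/reject rule used in practice, where acceptance is decided from the target model's distribution rather than exact token equality.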
The team values strong communication skills and the ability to share knowledge effectively. The interview process is thorough but efficient, typically completed within a week, and consists of a coding assessment, a hands-on systems evaluation, a project presentation, and a team meet-and-greet.
This is an excellent opportunity for a senior-level AI engineer who wants to work on challenging problems at the intersection of theoretical AI research and practical implementation, contributing to systems that will serve millions of users while advancing the field of AI.