Together AI is seeking a Senior Backend Engineer to join its Inference Platform team. This role offers an exciting opportunity to work at the cutting edge of AI infrastructure, optimizing performance for some of the most advanced generative AI models in the world. You'll work hands-on with state-of-the-art hardware, including H100, H200, and GB200 GPUs, solving complex challenges in distributed systems and high-performance computing.
The position involves building critical infrastructure components that power Together AI's frontier models, including global request routing, auto-scaling systems, and multi-tenant traffic shaping. You'll work directly with research teams to bring breakthrough AI models into production and contribute to the open source community through projects such as SGLang and vLLM.
The ideal candidate brings 5+ years of experience in distributed systems, with deep expertise in system optimization and scalability. Strong programming skills in languages such as Rust, Go, Python, or TypeScript are essential, and knowledge of AI/ML systems and GPU computing is highly valued. The role offers a unique blend of technical challenges, from millisecond-level latency optimization to large-scale resource management across global data centers.
Together AI offers a competitive compensation package including equity and benefits, with a base salary range of $160,000 to $250,000. The company culture emphasizes technical ownership and high impact, where your work directly contributes to making advanced AI models more accessible and efficient. This is a hybrid role requiring three days per week in the San Francisco office.
As part of a research-driven AI company committed to open and transparent AI systems, you'll be at the forefront of technological advancement, working with a team that has contributed to significant innovations like FlashAttention, Hyena, and RedPajama. This is an opportunity to shape the future of AI infrastructure while working with cutting-edge technology and world-class researchers.