Senior Backend Engineer, Inference Platform

Research-driven artificial intelligence company building open and transparent AI systems, focusing on lowering the cost of modern AI systems.
$160,000 - $250,000
Backend
Senior Software Engineer
Hybrid
51 - 100 Employees
5+ years of experience
AI

Job Description

Together AI is seeking a Senior Backend Engineer to join its Inference Platform team. This role offers an opportunity to work at the cutting edge of AI infrastructure, optimizing performance for some of the most advanced generative AI models in the world. You'll work hands-on with state-of-the-art hardware, including H100, H200, and GB200 GPUs, solving complex challenges in distributed systems and high-performance computing.

The position involves building critical infrastructure components that power Together AI's frontier models, including global request routing, auto-scaling systems, and multi-tenant traffic shaping. You'll work directly with research teams to productionize breakthrough AI models and contribute to the open-source community through projects like SGLang and vLLM.

The ideal candidate brings 5+ years of experience in distributed systems, with deep expertise in system optimization and scalability. Strong programming skills in languages like Rust, Go, Python, or TypeScript are essential, while knowledge of AI/ML systems and GPU computing is highly valued. The role offers a unique blend of technical challenges, from millisecond-level latency optimization to large-scale resource management across global data centers.

Together AI offers a competitive compensation package including equity and benefits, with a base salary range of $160,000 - $250,000. The company culture emphasizes technical ownership and high impact, where your work directly contributes to making advanced AI models more accessible and efficient. This is a hybrid role requiring three days per week in the San Francisco office.

As part of a research-driven AI company committed to open and transparent AI systems, you'll be at the forefront of technological advancement, working with a team that has contributed to significant innovations like FlashAttention, Hyena, and RedPajama. This is an opportunity to shape the future of AI infrastructure while working with cutting-edge technology and world-class researchers.

Last updated 3 days ago

Responsibilities For Senior Backend Engineer, Inference Platform

  • Build and optimize global and local request routing for low-latency load balancing
  • Develop auto-scaling systems for resource allocation across data centers
  • Design multi-tenant traffic shaping systems
  • Engineer trade-offs between latency and throughput
  • Optimize prefix caching to reduce model compute
  • Collaborate with ML researchers on new model architectures
  • Profile and analyze system-level performance

Requirements For Senior Backend Engineer, Inference Platform

Python
Go
TypeScript
Kubernetes
  • 5+ years of experience building large-scale distributed systems and API microservices
  • Strong background in system efficiency, scalability, and stability
  • Excellent understanding of low-level OS concepts
  • Expert-level programming in Rust, Go, Python, or TypeScript
  • Bachelor's or Master's degree in Computer Science or related field
  • Knowledge of modern LLMs and generative models (plus)
  • Experience with Kubernetes or container orchestration (plus)
  • Familiarity with GPU software stacks and HPC technologies (plus)

Benefits For Senior Backend Engineer, Inference Platform

Medical Insurance
Equity
  • Competitive compensation
  • Equity
  • Health insurance
  • Other competitive benefits
