Infra Engineer

Building a search engine from scratch to serve every AI application, with focus on web crawling, embedding models, and vector databases.
$150,000 - $300,000
DevOps
Senior Software Engineer
In-Person
11 - 50 Employees
5+ years of experience
AI

Job Description

Exa is revolutionizing AI applications by building a comprehensive search engine from the ground up. The company specializes in developing massive-scale infrastructure for web crawling, training cutting-edge embedding models, and creating high-performance vector databases in Rust. With a significant investment in hardware, including a $5M H200 GPU cluster, Exa manages operations involving thousands of machines.

The Infrastructure Team plays a crucial role in developing the foundational tooling and infrastructure that powers all of Exa's systems. The team is seeking infrastructure engineers to strengthen its engineering capabilities by building sophisticated systems such as GPU cluster orchestration in Kubernetes, map-reduce batch jobs on Ray, and world-class observability tooling.

This role offers an opportunity to work with cutting-edge technology at scale. You'll handle projects such as building Kubernetes orchestration for multi-million-dollar GPU clusters, scaling AWS batch job systems, and optimizing GPU scheduling for maximum efficiency. The position requires extensive experience with large-scale infrastructure and a meticulous approach to system reliability and optimization.

The position is based in San Francisco, offering a competitive salary range of $150K-$300K plus equity. The company provides visa sponsorship for international candidates (STEM OPT, OPT, H1B, O1, E3), demonstrating their commitment to attracting top talent globally. This is an excellent opportunity for experienced infrastructure engineers who want to work on challenging problems at the intersection of AI and large-scale systems.

Last updated 3 days ago

Responsibilities For Infra Engineer

  • Build Kubernetes orchestration on GPU clusters
  • Scale the AWS batch job system to handle map-reduce jobs over thousands of machines
  • Design GPU scheduling software for maximum cluster utilization
  • Build observability into production systems

Requirements For Infra Engineer

Kubernetes
Rust
  • Experience designing and operating large-scale infrastructure (GPU clusters, large Kubernetes clusters, or cloud batch job systems)
  • Obsessive mindset focusing on reliability, observability, and optimization across the entire stack

Benefits For Infra Engineer

  • Equity
  • Visa Sponsorship
