Staff ML Engineer - Infrastructure

Building AI-powered solutions for modern silicon chip design, backed by top investors and serving Fortune 100s and AI silicon startups.
Machine Learning
Staff Software Engineer
In-Person
11 - 50 Employees
5+ years of experience
AI

Description For Staff ML Engineer - Infrastructure

ChipStack is revolutionizing the way modern silicon chips are designed through AI-powered solutions. As a Staff ML Engineer in Infrastructure, you'll join a founding team backed by top investors including Khosla Ventures, Cerberus, and Clear Ventures. The role focuses on building core infrastructure for training, fine-tuning, evaluation, and deployment of LLMs across cloud and on-premise environments.

You'll work alongside experienced chip designers, ML scientists who have trained LLMs at scale, and top-tier infrastructure engineers. The team has deep roots at companies like Qualcomm, Nvidia, Google, Meta, and the Allen Institute for AI. This position offers a unique opportunity to apply ML and data infrastructure expertise to complex chip design challenges.

The ideal candidate is startup-oriented, self-motivated, and thrives in dynamic environments. You should be comfortable working independently, tackling difficult problems, and exploring new territory. The role requires strong expertise in Python, ML frameworks, distributed training, and cloud platforms, along with experience managing GPU/TPU workloads.

ChipStack's culture emphasizes challenging the status quo, learning collaboratively, shipping fast without sacrificing quality, and paying attention to detail. The company has already deployed with more than 10 customers, from Fortune 100 companies to AI silicon startups, so the role offers a chance to make a significant impact on the semiconductor industry.

Last updated a day ago

Requirements For Staff ML Engineer - Infrastructure

Python
Kubernetes
  • 5+ years of experience in ML infrastructure or adjacent roles
  • Deep expertise in Python and experience with training frameworks like PyTorch or TensorFlow
  • Strong systems engineering skills and experience with distributed training, data pipelines, and performance optimization
  • Experience deploying ML models to production (REST APIs, batch jobs, streaming pipelines)
  • Proficiency with cloud platforms (e.g., GCP, AWS) and containerized systems (Docker, Kubernetes)
  • Experience managing GPU/TPU workloads efficiently
  • Good communication skills and the ability to work directly with engineers and customers
