Machine Learning Engineer, Training Infrastructure

Hedra is a pioneering generative media company building Hedra Studio, a multimodal creation platform powered by its Character-3 foundation model for AI-driven content creation.
Machine Learning
Senior Software Engineer
In-Person
11 - 50 Employees
3+ years of experience
AI

Job Description

Hedra, a pioneering generative media company backed by top investors including Index, A16Z, and Abstract Ventures, is seeking a Machine Learning Engineer to join its team. The company is building Hedra Studio, a groundbreaking multimodal creation platform powered by its Character-3 foundation model, the first omnimodal model in production that jointly processes image, text, and audio for intelligent video generation.

The role focuses on managing and optimizing computational infrastructure for training and deploying machine learning models, particularly 3DVAE and video diffusion models. The ideal candidate should have 3+ years of experience in high-performance computing systems and expertise in managing ML workloads at scale.

Key responsibilities include designing scalable computing solutions, managing cloud infrastructure, optimizing system performance, and ensuring the platform can handle resource-intensive tasks associated with training large generative models. The position requires strong technical skills in cloud computing platforms, containerization technologies, and distributed training techniques.

The company offers a collaborative, in-person work environment in San Francisco with a team passionate about transforming content creation. Benefits include competitive compensation with equity, healthcare coverage, 401k, and office perks. This is an opportunity to join a cutting-edge AI company working on next-generation content creation technology.

Last updated 3 days ago

Responsibilities For Machine Learning Engineer, Training Infrastructure

  • Design, implement, and maintain scalable computing solutions for training and deploying ML models
  • Manage and optimize the performance of computing clusters or cloud instances
  • Ensure infrastructure can handle resource-intensive tasks for training large generative models
  • Monitor system performance and implement improvements using tools like Airflow
  • Collaborate across research teams to understand computational needs and provide solutions

Requirements For Machine Learning Engineer, Training Infrastructure

  • Bachelor's degree in Computer Science, Information Technology, or related field
  • Experience with cloud computing platforms (AWS, Google Cloud, or Microsoft Azure)
  • Knowledge of containerization technologies (Docker and Kubernetes)
  • Understanding of distributed training techniques
  • Strong problem-solving and communication skills
  • Values sound engineering processes, including version control and CI/CD

Benefits For Machine Learning Engineer, Training Infrastructure

  • Competitive compensation + equity
  • 401k (no match)
  • Healthcare (Silver PPO Medical, Vision, Dental)
  • Lunch and snacks at the office
