Research Infrastructure Engineer - Post-Training

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
$310,000 - $460,000
Backend
Senior Software Engineer
Hybrid
1,000 - 5,000 Employees
5+ years of experience
AI

Description For Research Infrastructure Engineer - Post-Training

OpenAI is seeking a Research Infrastructure Engineer to join its post-training team, which focuses on transforming large pre-trained models into user-friendly chatbots like ChatGPT. The role combines deep technical expertise with ML systems optimization and distributed systems knowledge. Based in San Francisco with a hybrid work model (3 days in office), the position offers a salary range of $310K-$460K plus equity and comprehensive benefits.

The role involves working across the entire technology stack, from optimizing low-level ML systems to managing job orchestration and data evaluation. You'll be responsible for building cutting-edge infrastructure and tools fundamental to ChatGPT's post-training phase. The team collaborates closely with research groups, creating systems that push the boundaries of what's possible with ChatGPT.

Key responsibilities include ensuring smooth operation of ChatGPT training systems, debugging complex ML codebases, building data management tools, and creating reusable Python libraries. You'll work on projects like profiling large model reinforcement learning training, identifying experiment failures, and redesigning data pipelines for multimodal data.

The ideal candidate should have experience with Python, Kubernetes, distributed infrastructure, GPUs, and large-scale data systems. Knowledge of reinforcement learning and transformers is crucial. While research experience isn't mandatory, experience collaborating with ML researchers in an applied setting is highly valued.

OpenAI offers an exceptional benefits package including medical/dental/vision insurance, mental health support, 401(k) matching, generous parental leave, and learning stipends. The company is committed to diversity, equality, and ensuring AI benefits all of humanity. This is an opportunity to shape the future of AI technology while working with cutting-edge systems and brilliant minds in the field.

Last updated 2 days ago

Responsibilities For Research Infrastructure Engineer - Post-Training

  • Ensure systems powering ChatGPT training and development run smoothly
  • Debug and analyze large ML codebases
  • Build tools for data management, model configuration, and evaluation
  • Create reusable Python libraries with great abstractions
  • Profile and optimize large model reinforcement learning training
  • Identify and address system bottlenecks
  • Redesign data pipelines for multimodal data
  • Build front-end evaluation tooling

Requirements For Research Infrastructure Engineer - Post-Training

  • Experience working in complex technical environments
  • Experience debugging ML systems
  • Experience with reinforcement learning and transformers
  • Experience with Python
  • Experience with Kubernetes / distributed infrastructure
  • Experience with GPUs
  • Experience with large-scale data systems (Apache Beam or Spark)
  • Team player mindset

Benefits For Research Infrastructure Engineer - Post-Training

  • Medical, dental, and vision insurance for you and your family
  • Mental health and wellness support
  • 401(k) plan with 50% matching
  • Generous time off and company holidays
  • 24 weeks of paid birth-parent leave and 20 weeks of paid parental leave
  • Annual learning & development stipend ($1,500 per year)
  • Equity compensation
  • Relocation assistance available
