Production Systems Engineer, AI Systems

Meta builds technologies that help people connect, find communities, and grow businesses, including apps like Facebook, Messenger, Instagram, and WhatsApp.
$163,000 - $225,000
Backend
Senior Software Engineer
In-Person
5,000+ Employees
6+ years of experience
AI

Description For Production Systems Engineer, AI Systems

Meta is seeking an experienced Systems Engineer to join its Release to Production (RTP) team working on the Meta Training and Inference Accelerator (MTIA) program. This role is crucial for Meta's AI/ML initiatives, supporting large-scale AI training and inference operations. The position focuses on the end-to-end hardware lifecycle of Meta's servers, including prototyping, debugging, and system monitoring.

The ideal candidate will work on scale-up and scale-out network technologies, particularly RDMA NICs, for the MTIA systems that power Meta's AI advancements. This role requires deep knowledge of network protocols (TCP/IP, RDMA) and hands-on experience with post-silicon validation for networking platforms.

As a Production Systems Engineer, you'll collaborate with various teams including hardware designers, networking teams, system manufacturers, and data center operations teams. You'll be responsible for system validation, troubleshooting, and ensuring the successful deployment of new platforms into Meta's fleet.

The position offers competitive compensation ranging from $163,000 to $225,000 per year, plus bonus and equity opportunities. This is an excellent opportunity for experienced engineers who want to work at the intersection of hardware systems and AI technology at one of the world's leading tech companies.

Meta provides a comprehensive benefits package and is an Equal Employment Opportunity employer that promotes an inclusive work environment. The role is based in Austin, TX, and offers the chance to work on cutting-edge AI infrastructure that powers Meta's services.

Last updated 17 hours ago

Responsibilities For Production Systems Engineer, AI Systems

  • Support the introduction of new MTIA platforms into the Meta fleet by working closely with the post-silicon validation team
  • Proactively create experiments and tooling to detect, reproduce and diagnose hardware/firmware/software health issues
  • Develop an understanding of AI workload traffic and incorporate it into new product introduction (NPI)
  • Contribute to enabling hacks for future technology explorations in the AI space
  • Troubleshoot, diagnose, and root-cause system failures
  • Develop visibility through data visualization and implement systemic solutions
  • Leverage production experience to drive continuous product quality improvement

Requirements For Production Systems Engineer, AI Systems

  • Bachelor's degree in Engineering or Computer Science
  • 6+ years of work experience in network ASIC development, network product deployment, or interconnect technologies
  • Knowledge of server architecture and components
  • Experience working with Linux
  • Knowledge of TCP/IP and experience using iperf
  • Hands-on troubleshooting and debugging experience

Benefits For Production Systems Engineer, AI Systems

  • Bonus
  • Equity
  • Medical Insurance

Jobs Related To Meta Production Systems Engineer, AI Systems

Manufacturing Test Engineer

Senior Manufacturing Test Engineer role at Meta developing and implementing test modules for Open Compute hardware manufacturing.

Data Center Systems Engineer

Senior Data Center Systems Engineer role at Meta, focusing on hardware and infrastructure optimization, offering $170K-$240K plus benefits.

Submarine Cable Systems Engineer

Lead submarine cable systems engineering at Meta, designing and implementing global subsea network infrastructure for the world's largest social technology company.

Business Support Engineer

Senior Business Support Engineer role at Meta, providing technical support and integration solutions for global partners in messaging, telecommunications, and fintech domains.