ML Engineer — LLM Evaluation

Dynamo AI

AI startup focused on developing safe, private, and responsible LLMs for Fortune 500 companies

San Francisco, CA, USA • New York, NY, USA • London, UK…

Machine Learning

Mid-Level Software Engineer

Remote

51 - 100 Employees

3+ years of experience

Description For ML Engineer — LLM Evaluation

Dynamo AI is at the forefront of developing safe, private, and responsible Large Language Models (LLMs) for enterprise applications. Named a 2023 CB Insights Top 100 AI Startup, the company focuses on democratizing AI advancements responsibly while maintaining user privacy and safety.

The ML Engineer position for LLM Evaluation offers a unique opportunity to work with a team of ML Ph.D.s and builders in a fast-paced environment, free from traditional corporate bureaucracy. The role involves developing and implementing cutting-edge evaluation processes for LLMs, focusing on real-world applications and safety considerations.

This position is perfect for candidates passionate about responsible AI development who want to see their impact within weeks rather than years. You'll be working on the premier platform for private and personalized LLMs, helping Fortune 500 companies adopt frontier research for their next generation of LLM products.

Key aspects of the role include owning LLM evaluation processes, generating synthetic data, conducting benchmarking, and delivering production-ready code. You'll also have the opportunity to contribute to academic research through papers and patents, working directly with the research team.

The ideal candidate should have strong domain knowledge in LLM evaluation, experience with benchmarking implementations, and the ability to adapt quickly to new research developments. This role offers the chance to make a significant impact on how LLMs are evaluated for safety and effectiveness in real-world applications.

Working at Dynamo AI means joining a team committed to challenging the status quo of AI development, where privacy and responsibility are not sacrificed for advancement. The position offers flexibility with remote work options and multiple office locations globally, making it an excellent opportunity for those looking to make a meaningful contribution to the future of responsible AI development.

Last updated 3 months ago

Responsibilities For ML Engineer — LLM Evaluation

Own LLM evaluation processes and methods with a focus on generating benchmarks representative of real-world usage and safety vulnerabilities
Generate high-quality synthetic data, curate labels, and conduct rigorous benchmarking
Deliver robust, scalable, and reproducible production code
Develop methods for benchmarking LLMs for harmlessness and helpfulness
Co-author papers, patents, and presentations with the research team

Requirements For ML Engineer — LLM Evaluation

Python

Domain knowledge in LLM evaluation and data curation techniques
Extensive experience in designing and implementing LLM benchmarks
Ability to lead end-to-end projects
Adaptability and flexibility to learn and implement state-of-the-art research
Preferred: past research or projects in benchmarking LLMs