GenAI Eval Software Engineer

Databricks is the data and AI company that enables data teams to solve the world's toughest problems by building and running the world's best data and AI infrastructure platform.
$142,200 - $204,600
Machine Learning
Mid-Level Software Engineer
In-Person
5,000+ Employees
2+ years of experience
AI · Enterprise SaaS

Description For GenAI Eval Software Engineer

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business.

The Applied AI team at Databricks sits at the forefront of advancing GenAI-powered products. We've launched Databricks Assistant and AI/BI Genie, which are used by hundreds of thousands of Databricks users daily. As a GenAI Eval Software Engineer, you'll be joining our team to build the GenAI Evaluation System — a critical foundation that ensures the quality, reliability, and performance of AI-driven products at scale.

In this role, you'll design and develop systems to evaluate Large Language Models (LLMs) and GenAI applications, enabling rapid experimentation, robust benchmarking, and continuous improvement across our AI products. You'll collaborate with ML engineers, product teams, and researchers to define what "good" looks like for GenAI applications, driving impact across Databricks products.

You'll be working at the forefront of GenAI, shaping how AI quality is defined and delivered at scale, while collaborating with world-class engineers and ML experts in a fast-paced, innovative environment. This is an opportunity to contribute to impactful AI products used by enterprises worldwide and be part of a company built on open source, transparency, and a strong engineering culture.

The role offers competitive compensation with a base salary range of $142,200 - $204,600 USD, along with comprehensive benefits including medical, dental, and vision insurance, 401k, equity, and more. Join us in building the future of AI evaluation and quality assurance at scale.

Responsibilities For GenAI Eval Software Engineer

  • Design, build, and maintain scalable infrastructure for evaluating LLMs and GenAI-powered features
  • Develop automated testing, benchmarking, and monitoring frameworks to measure model quality and reliability
  • Collaborate closely with ML engineers, product managers, and researchers to define evaluation metrics and methodologies
  • Enable rapid iteration by building tools that support A/B testing, human-in-the-loop evaluations, and dataset management
  • Contribute to the evolution of best practices for GenAI evaluation across diverse use cases

Requirements For GenAI Eval Software Engineer

  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience)
  • 2+ years of industry experience in software engineering, preferably in infrastructure, platforms, or ML tooling
  • Strong coding skills in languages such as Python, Scala, or Java
  • Experience building scalable backend systems, distributed systems, or developer platforms
  • Familiarity with SQL, machine learning workflows, LLMs, or AI evaluation concepts is a plus
  • Strong problem-solving skills and a collaborative mindset

Benefits For GenAI Eval Software Engineer

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 401k
  • Equity
