Grafana Labs is seeking a Senior Software Engineer specializing in GenAI & ML Evaluation Frameworks to join their AI teams. This role is crucial in helping users understand and improve their systems through AI-driven features. The position focuses on building and evolving internal evaluation frameworks for Generative AI systems, particularly Large Language Models (LLMs).
The role involves designing and scaling automated evaluation pipelines, integrating them into CI/CD workflows, and defining metrics that align with both product goals and model behavior. You'll implement robust evaluation frameworks, develop tooling for automated assessment of model outputs, and lead dataset management processes.
Grafana Labs is a leader in observability tools, with their open-source visualization tool used by over 20M users globally. Their tools help users monitor everything from beehives to climate change, and their technology stack is used by major companies including Bloomberg, JPMorgan Chase, and eBay.
The ideal candidate should have strong experience in evaluating AI/ML systems, familiarity with prompt engineering, and the ability to work autonomously. You'll be joining a company that values pragmatic approaches, reproducibility, and thoughtful trade-offs when scaling GenAI systems.
This is a remote position based in the United States, offering competitive compensation between $148,505 and $178,206, along with equity and comprehensive benefits. The role provides an opportunity to shape the future of AI-driven observability tools while working with a team passionate about reducing human toil and building supportive AI systems.