Zyphra, a Palo Alto-based AI company, is seeking a Machine Learning Engineer specializing in vision to join their team building Maia, a cutting-edge multimodal agent system. The role focuses on developing next-generation vision-language models for understanding natural scenes, particularly in web, desktop, and mobile UIs. The ideal candidate will contribute to large-scale vision encoder training, performance optimization, and dataset management.
The company brings together talent from leading AI organizations including Google DeepMind, Anthropic, StabilityAI, and others. They value both deep research and engineering excellence, embracing new ideas and maintaining a fast-paced, impact-driven culture. The team strongly emphasizes taking grounded, methodical steps toward ambitious goals.
The position requires strong research intuition, implementation skills, and the ability to work effectively in a collaborative environment. Experience with vision-language models, large-scale datasets, and deep learning frameworks like PyTorch is highly valued. The role offers comprehensive benefits including competitive compensation, healthcare, and equity, plus perks such as on-site meals and regular team events.
This is an in-person role at their Palo Alto headquarters, ideal for candidates passionate about AI who can contribute to both research and engineering implementation at scale. The company offers visa sponsorship for exceptional candidates and maintains a culture where both crazy ideas and methodical execution are celebrated.