Research Engineer - Multimodal Reasoning, SIML

Apple is a technology company that creates innovative products and experiences that enrich people's lives.
$143,100 - $264,200
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI
Description For Research Engineer - Multimodal Reasoning, SIML

Apple's Scene Understanding team within the Intelligence System Experience (ISE) organization is seeking a senior technical leader to work on cutting-edge multimodal machine learning research and development. This role sits at the intersection of research and product development, working on technologies that power experiences like Image Playground, Genmoji, Generative Memories, and Semantic Search.

The position involves architecting and deploying production-scale multimodal ML systems, with a focus on training and adapting large language models for compelling user experiences. The successful candidate will lead diverse cross-functional efforts spanning ML modeling, prototyping, validation and private learning.

The ISE team has a strong track record of delivering impactful ML-powered features across Apple's ecosystem, including Spotlight Search, Photos Memories, Generative Playgrounds, Stickers, and Smart wallpapers. The team specializes in scaling production ML workflows through distributed training and optimizing LLMs for on-device experiences.

Key responsibilities include:

  • Training large-scale multimodal (2D/3D vision-language) models using distributed systems
  • Deploying efficient neural architectures on devices
  • Developing privacy-preserving personalization approaches
  • Ensuring model quality, fairness and robustness
  • Enriching multimodal capabilities of large language models
  • Aligning visual content with language models for interactive experiences

The role requires close collaboration with ML researchers, software engineers, hardware teams and designers across Apple. The ideal candidate will have deep expertise in applied ML research, particularly at the intersection of computer vision and natural language processing.

This is an opportunity to shape the future of how users interact with Apple devices through advanced AI capabilities, while maintaining Apple's high standards for privacy and user experience. The position offers competitive compensation, comprehensive benefits, and the chance to work on technologies that will impact billions of users.

The team's work has been featured in several Apple Machine Learning Research publications, demonstrating its commitment to advancing the field while delivering practical applications.

Last updated 2 days ago

Responsibilities For Research Engineer - Multimodal Reasoning, SIML

  • Training large-scale multimodal models on distributed backends
  • Deploying compact neural architectures efficiently on device
  • Learning policies that can be personalized to users in a privacy-preserving manner
  • Ensuring quality, fairness and model robustness
  • Enriching multimodal capabilities of large language models
  • Aligning image/video content to LMs for visual actions & multi-turn interactions

Requirements For Research Engineer - Multimodal Reasoning, SIML

  • M.S. or PhD in Computer Science or related field
  • Hands-on experience training LLMs or adapting pre-trained LLMs
  • Modeling experience at the intersection of NLP and vision
  • Proficiency in an ML toolkit (e.g., PyTorch)
  • Strong programming skills in Python

Benefits For Research Engineer - Multimodal Reasoning, SIML

Medical Insurance
Dental Insurance
Vision Insurance
401k
Equity
Education Budget
  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Employee stock programs
  • Discretionary restricted stock unit awards
  • Employee Stock Purchase Plan
  • Education reimbursement
  • Discretionary bonuses
  • Relocation assistance
