Research Engineer - Multimodal Reasoning, SIML

Apple

Apple is a technology company that creates innovative products and experiences that enrich people's lives.

Cupertino, CA, USA

$143,100 - $264,200

Machine Learning

Staff Software Engineer

In-Person

5,000+ Employees

5+ years of experience

Description For Research Engineer - Multimodal Reasoning, SIML

Apple's Scene Understanding team within the Intelligence System Experience (ISE) organization is seeking a senior technical leader to work on cutting-edge multimodal machine learning research and development. This role sits at the intersection of research and product development, working on technologies that power experiences like Image Playground, Genmoji, Generative Memories, and Semantic Search.

The position involves architecting and deploying production-scale multimodal ML systems, with a focus on training and adapting large language models for compelling user experiences. The successful candidate will lead diverse cross-functional efforts spanning ML modeling, prototyping, validation and private learning.

The ISE team has a strong track record of delivering impactful ML-powered features across Apple's ecosystem, including Spotlight Search, Photos Memories, Generative Playgrounds, Stickers, and Smart wallpapers. The team specializes in scaling production ML workflows through distributed training and optimizing LLMs for on-device experiences.

Key responsibilities include:

Training large-scale multimodal (2D/3D vision-language) models using distributed systems
Deploying efficient neural architectures on devices
Developing privacy-preserving personalization approaches
Ensuring model quality, fairness and robustness
Enriching multimodal capabilities of large language models
Aligning visual content with language models for interactive experiences

The role requires close collaboration with ML researchers, software engineers, hardware teams and designers across Apple. The ideal candidate will have deep expertise in applied ML research, particularly at the intersection of computer vision and natural language processing.

This is an opportunity to shape the future of how users interact with Apple devices through advanced AI capabilities, while maintaining Apple's high standards for privacy and user experience. The position offers competitive compensation, comprehensive benefits, and the chance to work on technologies that will impact billions of users.

The team's work has been featured in several Apple Machine Learning Research publications, demonstrating its commitment to advancing the field while delivering practical applications.

Responsibilities For Research Engineer - Multimodal Reasoning, SIML

Training large-scale multimodal models on distributed backends
Deploying compact neural architectures efficiently on device
Learning policies that can be personalized to users in a privacy-preserving manner
Ensuring quality, fairness and model robustness
Enriching multimodal capabilities of large language models
Aligning image/video content with LMs for visual actions and multi-turn interactions

Requirements For Research Engineer - Multimodal Reasoning, SIML

Python

M.S. or PhD in Computer Science or related field
Hands-on experience training LLMs or adapting pre-trained LLMs
Modeling experience at the intersection of NLP and vision
Proficiency in an ML toolkit (e.g., PyTorch)
Strong programming skills in Python

Benefits For Research Engineer - Multimodal Reasoning, SIML

Medical Insurance

Dental Insurance

Vision Insurance

401k

Equity

Education Budget

Comprehensive medical and dental coverage
Retirement benefits
Employee stock programs
Discretionary restricted stock unit awards
Employee Stock Purchase Plan
Education reimbursement
Discretionary bonuses
Relocation assistance