We are seeking a highly motivated and skilled Applied Research Engineer to join our team in the Video Computer Vision org. This centralized applied research and engineering organization is responsible for developing real-time, on-device Computer Vision and Machine Perception technologies across Apple products. The ideal candidate will have a strong background in developing and exploring multimodal large language models that integrate diverse types of data such as text, images, video, and audio.
Key responsibilities include:
• Conducting research and development on multimodal large language models, with a focus on exploring and utilizing diverse data modalities
• Designing, implementing, and evaluating algorithms and models to enhance the performance and capabilities of our AI systems
• Collaborating with cross-functional teams, including researchers, data scientists, and software engineers, to translate research into practical applications
• Staying up-to-date with the latest advancements in AI, machine learning, and computer vision, and applying this knowledge to drive innovation within the company
We balance research and product to deliver Apple-quality, state-of-the-art experiences, innovating across the full stack and partnering with HW, SW, and ML teams to influence the sensor and silicon roadmap that brings our vision to life.
This role offers the opportunity to work on cutting-edge research projects to advance our AI and computer vision capabilities, contributing to both foundational research and practical applications. Join us in pushing the boundaries of what is possible with foundation models, LLMs, and multimodal LLMs!