Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms that handle petabytes of data. As a Reliability Engineer, you'll develop and operate our big data platform using open source solutions to support critical applications in analytics, reporting, and AI/ML. You'll optimize performance and cost, automate operations, and ensure platform reliability.
The role requires expertise in distributed systems, with a focus on data processing technologies like Apache Spark and data lake solutions. You'll work with modern cloud infrastructure, manage multi-tenant Kubernetes clusters, and build resilient data pipelines. The ideal candidate has strong programming skills in Java, Python, or similar languages, and experience with incident management and performance optimization.
You'll join a team that values innovation and collaboration, building solutions that don't yet exist. Your work will directly impact Apple's data infrastructure, which supports critical applications across the company. This is an opportunity to contribute to high engineering standards while working with current technologies in AI and data platforms.
The position offers the chance to work with both the Austin and Sunnyvale teams, providing exposure to diverse projects and challenges. You'll be expected to bring your expertise in reliability engineering while continuing to learn and adapt to new technologies in the rapidly evolving field of data platforms and AI infrastructure.