Toyota Financial Services (TFS) is seeking a Lead Site Reliability Engineer to lead its cloud platform operations on AWS. The role combines infrastructure management with software engineering, focusing on building resilient, self-healing systems that power TFS's critical financial operations. As part of Toyota, one of the world's most admired brands, you'll work in a collaborative environment that values innovation and continuous improvement.
The position requires deep expertise in cloud infrastructure and SRE best practices. Responsibilities range from operating cloud-native infrastructure to implementing advanced observability solutions. You'll work with AWS services including EKS, Lambda, and Cloud WAN, and build automation workflows that improve system reliability and reduce manual operations.
The ideal candidate brings 7+ years of relevant experience and a strong foundation in SRE principles. You'll be responsible for defining and tracking service level objectives, managing infrastructure as code with Terraform, and participating in incident management processes. The role offers an opportunity to work with a diverse tech stack including Python, Kubernetes, and various AWS services.
TFS offers a comprehensive benefits package including healthcare, 401(k) with company match, vehicle purchase discounts, and professional development opportunities. The company culture emphasizes teamwork, respect, and innovation, making it an ideal environment for those who want to make a significant impact while working with enterprise-scale cloud infrastructure.
This position is an opportunity for an experienced SRE professional to join a leading financial services organization and help shape the future of its cloud infrastructure, with the stability and benefits of working for a Fortune 500 company.