As a Senior Site Reliability Engineer in the Block Storage team at Oracle, you will be responsible for leading and mentoring the team, driving projects end-to-end, and ensuring the reliability and performance of our services. Your role will involve monitoring services, debugging operational issues, and working with internal and external teams to diagnose performance problems. You'll be tasked with automating build and test systems, improving deployment processes across multiple regions, and participating in on-call rotations to resolve complex distributed-systems issues. Your expertise will be crucial in developing runbooks, alarms, and tools that enable customers to self-diagnose problems. Additionally, you'll play a key role in deploying services to new regions and automating this process.
You should have 5+ years of software development or automation experience in a Linux-based environment, with strong skills in Python and shell scripting. Proficiency with Linux-based build tools, CI/CD environments, and networking protocols is essential. Familiarity with Docker containers, databases, and distributed storage technologies is highly valued. You should also possess excellent troubleshooting and performance-tuning skills and hold a bachelor's degree in computer science, engineering, or a related field.
At Oracle, you'll be part of a world-leading cloud solutions provider that values innovation, diversity, and work-life balance. The company offers competitive benefits and global career opportunities. Join Oracle to work on cutting-edge technology and contribute to solving today's most challenging problems in a supportive and inclusive environment.