Duffel is building modern infrastructure to simplify travel distribution, search, and booking. As a Site Reliability Engineer, you'll join a team at a company backed by investors including Benchmark and Index Ventures. You'll be responsible for maintaining and improving the reliability, performance, and resilience of Duffel's infrastructure and applications.
The role involves working with GCP, Kubernetes, and OpenTelemetry. You'll handle critical infrastructure components, manage high-availability metrics collection systems, and oversee data pipelines. The team is currently focused on improving reliability monitoring and implementing OpenTelemetry with Honeycomb for better insight into production.
Future challenges include expanding to multiple regions globally and improving deployment strategies. You'll be working in a collaborative environment using tools like Elixir, Phoenix, and various GCP services. The position offers significant technical challenges, from debugging complex configuration issues to architecting multi-regional infrastructure.
As part of the team, you'll contribute to building tools intended to make travel effortless for over 4 billion airline passengers. The company offers equity ownership, is committed to personal growth, and maintains an inclusive environment that values diverse perspectives and problem-solving abilities.