Site Reliability / GitOps Engineer

Pioneering tech firm that publishes Ubuntu, a leading open source platform for AI, IoT and cloud computing.
Site Reliability
Remote
Enterprise SaaS

Description For Site Reliability / GitOps Engineer

Canonical, the company behind Ubuntu, is seeking a Site Reliability / GitOps Engineer to join its Information Systems team. The role is an opportunity to impact services used by over 60 million Ubuntu users. You'll be at the forefront of operations automation, working across private and public clouds with cutting-edge open source infrastructure-as-code software.

As an SRE and GitOps engineer, you'll drive operations automation across private and public cloud infrastructure. The role involves not only defining infrastructure as code but also improving Canonical products through critical feedback and collaboration with development teams.

The position is fully remote; Canonical has been a remote-first company since 2004. You'll be part of a global team of SREs working together to provide the best possible services to the company, Canonical's customers, and the Ubuntu community. The role combines hands-on technical work with collaboration, allowing you to grow your expertise while working with some of the best people in the industry.

Key responsibilities include developing infrastructure as code practices, automating software operations, maintaining critical services, and handling escalations. You'll work with modern technologies like Prometheus, Grafana, and Elasticsearch, while having the opportunity to contribute to open source projects.

The ideal candidate should have strong experience in IT operations automation, infrastructure as code, and a deep passion for technology, particularly Linux and open source. Benefits include a competitive compensation package, learning and development budget, and various travel opportunities to meet colleagues at 'sprints'.

Responsibilities For Site Reliability / GitOps Engineer

  • Develop the infrastructure as code practice within Information Systems (IS) by increasing automation and improving IaC processes
  • Automate software operations across private and public clouds
  • Develop features and improve resilience and scalability of cloud and container portfolio
  • Maintain operational responsibility for Canonical's core services, networks, and infrastructure
  • Set up and maintain observability tools like Prometheus, Grafana, and Elasticsearch
  • Collaborate with development teams on service architecture and documentation
  • Provide assistance to globally distributed engineering teams
  • Handle time-critical escalations
  • Share experience and best practices with team members

Requirements For Site Reliability / GitOps Engineer

Python
Linux
Kubernetes
  • Deep experience in defining operations in code, using version control, peer review and CI/CD
  • Strong modern engineering background (peer review, unit testing, SCM, CI/CD, Agile)
  • Python software development experience with large projects
  • Practical knowledge of Linux networking, routing, and firewalls
  • Hands-on experience administering enterprise Linux servers
  • Extensive knowledge of cloud computing concepts and technologies
  • Bachelor's degree or greater, preferably in computer science or related engineering field
  • Strong English communication skills
  • Experience with Linux storage, from Ceph to databases
  • Passion for open-source, especially Ubuntu or Debian

Benefits For Site Reliability / GitOps Engineer

  • Personal learning and development budget of USD 2,000 per annum
  • Annual compensation review
  • Recognition rewards
  • Annual holiday leave
  • Parental Leave
  • Employee Assistance Programme
  • Travel opportunities for team meetings
  • Priority Pass for travel and travel upgrades
  • Fully remote working environment
