Cerebrium (https://www.cerebrium.ai) is a serverless infrastructure platform that makes it easy for companies to build and scale data/AI workloads. With a team of just 4 across the US and South Africa, we have scaled to millions in revenue, serving some of the most ambitious AI teams from Seed to Series C. We are backed by world-class investors like Gradient Ventures (Google's AI fund) and Y Combinator, with over $8m in funding.
As a Systems Engineer at Cerebrium, you'll be responsible for the core infrastructure that powers thousands of CPU/GPU workloads. You'll work on low-level systems like custom file systems, container runtimes, and deployment pipelines, while owning critical components across compute, storage, and networking.
This role demands deep technical expertise in areas like containerization, distributed systems, infrastructure as code (e.g. Terraform), observability, and multi-cloud environments (AWS, GCP, etc.). You should be reliability-obsessed and performance-driven, enjoy solving hard technical problems, and be able to take full ownership of a task, from initial discovery and technical validation through implementation and release.
We obsess over performance: providing low latency, high scalability, and reliability to our clients. This has led us to engineer core components across our stack from the ground up, including our own content-aware file system and custom image-building pipeline, and to optimize our storage and network layers. Every decision we make is driven by the experience we want to unlock for developers: fast, reliable, and intuitive.
You'll work alongside a team that has founded and exited companies, led engineering teams of 80+, and built distributed systems at scale across multiple industries. You'll learn a lot, but you'll also be expected to take ownership, move quickly, and directly influence the direction of our product.
Our work culture emphasizes output over hours worked. We maintain a flat structure and welcome challenges to our thinking. We ship multiple times a week, adding value to customers continuously. The position requires 3 days per week in our Manhattan office.
Benefits include competitive salary and meaningful equity, comprehensive health coverage (80%), unlimited PTO, regular company off-sites (previous locations include Rio, Budapest, Tulum and Athens), and a learning budget. Join us in building the future of AI infrastructure!