Boson AI, an innovative startup in the AI space, is seeking a Senior High Performance Computing Engineer to join its team. Founded by renowned experts Alex Smola and Mu Li, the company is at the forefront of developing large language tools and generative AI models for language, audio, and entertainment.
The role offers an exceptional opportunity to work with cutting-edge technology, including NVIDIA H100 and A100 GPUs, more than 20PB of storage, terabit networking, and hundreds of computers. You'll be responsible for operating the GPUs, network, and filesystem in the data center deployment in Toronto, work that requires strong problem-solving skills and an adaptable learning mindset.
As a Senior HPC Engineer, you'll be at the heart of the infrastructure that powers Boson AI's innovative work. The position involves managing high-end GPU clusters, configuring complex networking systems, and maintaining critical infrastructure components like MAAS, Ceph, Slurm, and Kubernetes. You'll need to be comfortable with both software and hardware aspects, as the role involves hands-on configuration and maintenance of physical systems.
The ideal candidate will bring a strong background in high-performance computing, experience with data center operations, and proficiency in programming. You'll be working with state-of-the-art technology in a dynamic startup environment, contributing directly to the infrastructure that enables advanced AI development. The role offers competitive compensation and the opportunity to work with leading experts in the field of AI and machine learning.
If you're passionate about high-performance computing, have a strong technical background, and want to be part of a team pushing the boundaries of AI technology, this role presents an exciting opportunity to make a significant impact in a rapidly growing field.