Microsoft's Azure Machine Learning organization is seeking a talented Software Engineer to join its Inference team, which focuses on next-generation model-serving capabilities. The role sits at the cutting edge of AI and cloud technology, working with OpenAI models such as ChatGPT and supporting Bing and Office applications.
The position involves developing and maintaining high-performance, scalable platforms for model inference that handle billions of requests daily. You'll work with state-of-the-art LLMs and diffusion models, optimizing their performance and cost-effectiveness at scale.
As part of Microsoft's vision to democratize machine learning, you'll help make ML accessible to enterprises, developers, and data scientists worldwide. The role requires expertise in C/C++, Python, and modern cloud technologies such as Kubernetes and Docker.
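To give a flavor of the kind of work involved, here is a minimal sketch of a model-serving endpoint in Python. The framework (FastAPI), the model, and the endpoint names are illustrative assumptions only, not a description of the Inference team's actual stack.

```python
# Illustrative sketch only: FastAPI, Hugging Face transformers, and the model
# name are assumptions for demonstration, not the team's actual serving stack.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load a small text-generation model once at startup so each request
# only pays the cost of a forward pass.
generator = pipeline("text-generation", model="gpt2")


class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64


@app.post("/generate")
def generate(req: GenerateRequest):
    # Run inference and return only the generated text.
    outputs = generator(req.prompt, max_new_tokens=req.max_new_tokens)
    return {"completion": outputs[0]["generated_text"]}

# Run locally with: uvicorn app:app --host 0.0.0.0 --port 8000
# In production, a service like this would typically be containerized with
# Docker and deployed behind an autoscaling Kubernetes service.
```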
The ideal candidate will have experience with large-scale machine learning model deployment, strong programming skills, and the ability to work effectively in a geo-distributed team environment. You'll be tackling challenging problems at the intersection of AI and cloud computing, working with one of the largest GPU fleets in the world.
This is an excellent opportunity for someone passionate about AI infrastructure who wants to make a significant impact on how machine learning models are served at enterprise scale. You'll join a collaborative team that values innovation and technical excellence, building the technology that powers Microsoft's AI initiatives.