Microsoft is seeking a Senior Software Engineer to join its Search Ads Understanding team, focusing on GPU inference optimization for large language models (LLMs) and small language models (SLMs). The role supports GPU serving of models for Ads tasks including query rewrite, Ad relevance, and Ad creative generation.
The position offers an exciting opportunity to work on fundamental abstractions, programming models, runtimes, libraries, and APIs to enable large-scale inferencing and online serving of models on novel AI hardware. The role requires extensive hands-on software development skills and experience with GPU optimization.
As part of Microsoft's mission to empower every person and organization globally, you'll be working in a collaborative environment with researchers and developers, tackling complex technical challenges in building a full end-to-end AI stack. The role demands an entrepreneurial approach and the ability to take initiative in a fast-paced environment.
The position offers a hybrid work arrangement with up to 50% work-from-home flexibility and requires 0-25% travel. You'll be part of Microsoft's inclusive work environment, which values growth mindset, innovation, and collaboration. The company provides comprehensive benefits including industry-leading healthcare, educational resources, savings and investment opportunities, parental leave, and various other perks.
This is an excellent opportunity for experienced software engineers passionate about GPU optimization, machine learning, and working with cutting-edge technology at scale. The role combines technical depth with business impact, as your work will directly influence the performance and efficiency of Microsoft's advertising systems.