Sr. Software Engineer--GPU Inference Optimization

Microsoft is a global technology company that empowers every person and organization on the planet to achieve more.
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Sr. Software Engineer--GPU Inference Optimization

Microsoft is seeking a Senior Software Engineer to join its Search Ads Understanding team, focusing on GPU inference optimization for large language models (LLMs) and small language models (SLMs). This role is crucial for supporting GPU serving of models for various Ads tasks, including query rewrite, Ad relevance, and Ad creative generation.

The position offers an exciting opportunity to work on fundamental abstractions, programming models, runtimes, libraries, and APIs to enable large-scale inferencing and online serving of models on novel AI hardware. The role requires extensive hands-on software development skills and experience with GPU optimization.

As part of Microsoft's mission to empower every person and organization globally, you'll be working in a collaborative environment with researchers and developers, tackling complex technical challenges in building a full end-to-end AI stack. The role demands an entrepreneurial approach and the ability to take initiative in a fast-paced environment.

The position offers a hybrid work arrangement with up to 50% work-from-home flexibility and requires 0-25% travel. You'll be part of Microsoft's inclusive work environment that values growth mindset, innovation, and collaboration. The company provides comprehensive benefits including industry-leading healthcare, educational resources, savings and investment opportunities, parental leave, and various other perks.

This is an excellent opportunity for experienced software engineers passionate about GPU optimization, machine learning, and working with cutting-edge technology at scale. The role combines technical depth with business impact, as your work will directly influence the performance and efficiency of Microsoft's advertising systems.

Last updated 2 months ago

Responsibilities For Sr. Software Engineer--GPU Inference Optimization

  • Software development in C/C++, Python, and in GPU languages such as CUDA, ROCm, or Triton
  • Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference at optimal cost
  • Engage with key partners to understand and implement inference and training optimization for state-of-the-art LLMs and other models

Requirements For Sr. Software Engineer--GPU Inference Optimization

Python
  • Bachelor's degree in computer science or related technical field AND 4+ years of technical engineering experience
  • 3+ years of practical experience working on applications that use GPUs, and experience optimizing their performance
  • Practical experience writing new GPU kernels
  • Cross-team collaboration skills
  • Experience in low-level performance analysis and optimization
  • Technical background and solid foundation in software engineering principles and architecture design
  • Experience with deep learning frameworks such as PyTorch, TensorFlow, or ONNX Runtime

Benefits For Sr. Software Engineer--GPU Inference Optimization

Medical Insurance
Dental Insurance
Vision Insurance
Parental Leave
Education Budget
401k
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
