Senior ML Research Engineer – LLM Quantization & Model Optimization

Microsoft delivers cloud infrastructure and foundational technologies for its cloud businesses, including Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams, and Xbox Live.
$119,800 - $234,700
Machine Learning
Staff Software Engineer
Hybrid
5,000+ Employees
4+ years of experience
AI · Enterprise SaaS

Description For Senior ML Research Engineer – LLM Quantization & Model Optimization

Microsoft's Strategic Planning and Architecture (SPARC) team within Azure Hardware Systems and Infrastructure (AHSI) is seeking a Senior ML Research Engineer specializing in LLM Quantization & Model Optimization. This role combines cutting-edge research in machine learning with practical implementation in Microsoft's cloud infrastructure.

The position offers a unique opportunity to work at the intersection of large language models and hardware optimization, developing novel quantization techniques and optimization strategies for LLM deployment. You'll be part of the team that powers Microsoft's expanding cloud infrastructure, supporting services like Azure, Bing, Office 365, and Xbox Live.

As a Senior ML Research Engineer, you'll lead efforts in designing and implementing state-of-the-art model optimization techniques, working closely with cross-functional teams to improve the efficiency and performance of large language models. The role requires deep expertise in model quantization, optimization, and a strong understanding of Transformer architectures.

The position offers competitive compensation ($119,800 - $234,700 USD), comprehensive benefits, and a hybrid work environment in which up to 50% of working time can be spent at home. You'll be part of Microsoft's mission to empower every person and organization on the planet to achieve more, working with cutting-edge technology and collaborating with leading researchers and engineers.

This role is perfect for someone who combines strong theoretical knowledge with practical engineering skills, has a track record of research publications, and wants to impact the future of AI infrastructure at scale. You'll have the opportunity to influence the direction of LLM optimization at Microsoft while working with some of the most advanced AI systems in the industry.

The position requires a doctorate or equivalent experience, with at least 4 years of combined experience including 2+ years in industry focusing on low-precision model optimization. You'll be working in a collaborative environment that values innovation, technical excellence, and cross-team collaboration.

Last updated 2 days ago

Responsibilities For Senior ML Research Engineer – LLM Quantization & Model Optimization

  • Design and develop novel quantization techniques to enable efficient deployment of LLM inference and training
  • Drive proof-of-concept efforts for software development and model optimization tooling
  • Analyze performance bottlenecks in state-of-the-art LLM architectures
  • Prototype and evaluate emerging low-precision data formats
  • Co-design model architectures optimized for low-precision deployment
  • Work cross-functionally with data scientists and ML researchers/engineers
  • Partner with hardware architecture and AI software framework teams

Requirements For Senior ML Research Engineer – LLM Quantization & Model Optimization

  • Doctorate in relevant field OR equivalent experience
  • 4+ years of combined experience, including 2+ years of industry experience in low-precision model optimization and quantization for LLM workloads
  • Experience publishing academic papers as a lead author or essential contributor
  • Proficient with deep learning frameworks and inference runtimes such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime
  • Programming skills in Python, C, and C++
  • Excellent communication skills and a team-oriented mindset

Benefits For Senior ML Research Engineer – LLM Quantization & Model Optimization

  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
