Senior ML Research Engineer – LLM Quantization & Model Optimization

Microsoft

Microsoft delivers the cloud infrastructure and foundational technologies behind its cloud businesses, including Azure, Bing, MSN, Office 365, OneDrive, Skype, Teams, and Xbox Live.

Mountain View, CA, USA

$119,800 - $234,700

Machine Learning

Staff Software Engineer

Hybrid

5,000+ Employees

4+ years of experience

AI · Enterprise SaaS

Description For Senior ML Research Engineer – LLM Quantization & Model Optimization

Microsoft's Strategic Planning and Architecture (SPARC) team within Azure Hardware Systems and Infrastructure (AHSI) is seeking a Senior ML Research Engineer specializing in LLM Quantization & Model Optimization. This role combines cutting-edge research in machine learning with practical implementation in Microsoft's cloud infrastructure.

The position offers a unique opportunity to work at the intersection of large language models and hardware optimization, developing novel quantization techniques and optimization strategies for LLM deployment. You'll be part of the team that powers Microsoft's expanding cloud infrastructure, supporting services like Azure, Bing, Office 365, and Xbox Live.

As a Senior ML Research Engineer, you'll lead efforts in designing and implementing state-of-the-art model optimization techniques, working closely with cross-functional teams to improve the efficiency and performance of large language models. The role requires deep expertise in model quantization and optimization, along with a strong understanding of Transformer architectures.

The position offers competitive compensation ($119,800 - $234,700 USD), comprehensive benefits, and a hybrid work arrangement with the flexibility to work from home up to 50% of the time. You'll be part of Microsoft's mission to empower every person and organization on the planet to achieve more, working with cutting-edge technology and collaborating with leading researchers and engineers.

This role is perfect for someone who combines strong theoretical knowledge with practical engineering skills, has a track record of research publications, and wants to impact the future of AI infrastructure at scale. You'll have the opportunity to influence the direction of LLM optimization at Microsoft while working with some of the most advanced AI systems in the industry.

The position requires a doctorate or equivalent experience, with at least 4 years of combined experience, including 2+ years in industry focused on low-precision model optimization. You'll work in an environment that values innovation, technical excellence, and cross-team collaboration.

Last updated 2 days ago

Responsibilities For Senior ML Research Engineer – LLM Quantization & Model Optimization

Design and develop novel quantization techniques to enable efficient LLM inference and training
Drive proof-of-concept efforts in software development and model optimization tooling
Analyze performance bottlenecks in state-of-the-art LLM architectures
Prototype and evaluate emerging low-precision data formats
Co-design model architectures optimized for low-precision deployment
Work cross-functionally with data scientists and ML researchers/engineers
Partner with hardware architecture and AI software framework teams

Requirements For Senior ML Research Engineer – LLM Quantization & Model Optimization

Python

Doctorate in relevant field OR equivalent experience
4+ years of combined experience, including 2+ years of industry experience in low-precision model optimization and quantization for LLM workloads
Experience publishing academic papers as a lead author or essential contributor
Proficient with deep learning frameworks and inference runtimes such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime
Programming skills in Python, C, and C++
Excellent communication skills and a team-oriented mindset

Benefits For Senior ML Research Engineer – LLM Quantization & Model Optimization

Medical Insurance

Parental Leave

Education Budget

Industry-leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect
