Friday, July 18, 2025

Microsoft Unveils ‘Mu’: Lightweight AI Model for Fast On-Device Assistance

Microsoft unveils AI model Mu for fast on-device help on Copilot+ PCs

Microsoft has introduced a compact AI model named Mu, designed to run locally on Copilot+ PCs using the device’s Neural Processing Unit (NPU). Unlike cloud-dependent models, Mu offers real-time, low-latency assistance directly on the device, starting with integration into the Windows Settings app for users in the Windows Insider Dev Channel. Users can interact with it using natural language queries like “turn on night light,” and get instant responses.

What Sets Mu Apart?
Mu has 330 million parameters and follows a Transformer encoder-decoder architecture: an encoder processes the input once, and a decoder generates the output from that encoded representation. According to Microsoft, this design improves speed and responsiveness, letting Mu generate over 100 tokens per second, which the company says is significantly faster than comparable models.

The model was trained on Azure A100 GPUs and uses techniques such as grouped-query attention and rotary positional embeddings, which help maintain performance on constrained hardware.
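For readers curious what these two techniques do, here is a minimal pure-Python sketch. It is illustrative only and is not Mu's implementation: the function names, the frequency base of 10000, and the head counts are assumptions chosen for the example. Rotary positional embeddings rotate pairs of vector components by an angle that depends on the token position, so the similarity between two tokens depends only on how far apart they are. Grouped-query attention lets several query heads share one key/value head, which reduces the memory needed for cached keys and values.

```python
import math


def rope(vec, position, base=10000.0):
    """Apply a rotary positional embedding to one vector.

    Each consecutive pair (vec[i], vec[i + 1]) is rotated by an angle
    proportional to the token position. `base` is an assumed default.
    """
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """Grouped-query attention: map a query head to the shared key/value head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into key/value heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    # Both pairs of positions are 2 apart, so the two scores match
    # up to floating-point rounding.
    print(dot(rope(q, 5), rope(k, 3)))
    print(dot(rope(q, 12), rope(k, 10)))
    # With 8 query heads and 2 key/value heads (hypothetical counts),
    # each key/value head serves a group of 4 query heads.
    print([kv_head_for(h, 8, 2) for h in range(8)])
```

In this toy setup, only two sets of keys and values need to be stored instead of eight, which is the kind of saving that matters on memory-constrained hardware.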

To ensure broad compatibility, Microsoft collaborated with Intel, AMD, and Qualcomm, optimizing Mu through post-training quantization, which converts an already-trained model from floating-point values to 8-bit and 16-bit integer formats. On the Surface Laptop 7, Mu exceeds 200 tokens per second, outpacing Microsoft’s earlier, slower Phi LoRA prototype while maintaining high accuracy.
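The idea behind post-training quantization can be shown with a small sketch. It assumes the simplest scheme, symmetric 8-bit quantization with a single scale per tensor; the article does not describe the scheme Microsoft actually used, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    print(quantized)  # [50, -127, 0, 100]
    # Each restored value differs from the original by at most scale / 2.
    print(max(abs(w - r) for w, r in zip(weights, restored)))
```

Each weight now fits in one byte instead of four, and the rounding error per weight is bounded by half the scale. That trade of a small, bounded error for lower memory use and faster integer arithmetic is what makes the technique attractive on NPUs.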

Trained for Real-World Use
Mu’s training dataset was expanded to 3.6 million samples, and the model now supports hundreds of Windows settings. Techniques like prompt tuning and noise injection improve its ability to understand diverse, real-world queries.
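As a rough illustration of noise injection, the sketch below adds typo-like character errors to a settings query so that a model sees imperfect phrasing during training. The kinds of noise (dropping, duplicating, or swapping characters), the rate, and the function name are assumptions made for this example; the article does not say what form of noise Microsoft used.

```python
import random


def inject_noise(query, rate=0.1, seed=None):
    """Return a copy of `query` with random typo-like character errors.

    Spaces are left alone so that word boundaries survive.
    """
    rng = random.Random(seed)
    chars = list(query)
    out = []
    i = 0
    while i < len(chars):
        if chars[i] != " " and rng.random() < rate:
            op = rng.choice(["drop", "duplicate", "swap"])
            if op == "drop":
                i += 1
                continue
            if op == "duplicate":
                out.extend([chars[i], chars[i]])
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], chars[i]])
                i += 2
                continue
        # No noise applied (or a swap at the final character): keep as-is.
        out.append(chars[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    # A fixed seed makes the augmented variants reproducible.
    for seed in range(3):
        print(inject_noise("turn on night light", rate=0.15, seed=seed))
```

Pairing each noisy variant with the same target setting as the clean query is one way a dataset could be expanded to cover the misspellings real users produce.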

Part of a Bigger Vision
Mu represents Microsoft’s strategic shift to on-device AI, building upon earlier research from models like Phi and Phi Silica. As Copilot+ PCs become more prevalent, Mu is expected to play a central role in enabling fast, secure, and intelligent user experiences without relying on the cloud.