Microsoft's Mu Brings Natural Language Chat to Windows 11 Settings Menu

2025-06-24

Mu, Microsoft's latest compact language model, is engineered specifically for on-device processing and debuts inside the Windows 11 Settings application.

Mu powers the AI agent in the Settings app, letting users pose configuration questions in natural language. With the user's permission, the agent can also carry out system adjustments on its own, drawing on an understanding of hundreds of system settings.

For now, Mu is available only in select Windows Insider Preview builds.

Technical Architecture: Balancing Performance with Hardware Constraints

In a June 23rd blog post, Microsoft detailed how Mu works. The model was trained on NVIDIA A100 GPUs in Azure's machine learning infrastructure, then optimized to run on PC neural processing units (NPUs), where it generates more than 100 tokens per second.

Mu builds on Microsoft's earlier Phi Silica work, the on-device language model developed for the Snapdragon X-series Copilot+ PCs that launched with Windows 11 in 2024.

The company emphasizes that Mu's encoder-decoder architecture is more efficient than a decoder-only design.

"By separating input and output token processing, Mu's encoding strategy significantly reduces computational and memory demands," wrote Vivek Pradeep, Vice President and Distinguished Engineer for Windows Applied Sciences at Microsoft, in the blog post. "This design choice delivers reduced latency and enhanced throughput on dedicated hardware."

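
The quoted efficiency argument comes down to the inference loop: the input is encoded exactly once, and every decoding step reuses that fixed representation rather than reprocessing the input. Below is a minimal Python sketch of that pattern; the `encode`/`decode_step` interface and the `EchoModel` stub are illustrative assumptions, not Mu's actual API.

```python
def generate(model, input_tokens, max_new_tokens, eos_id=0):
    """Encode-once / decode-many loop typical of encoder-decoder models."""
    memory = model.encode(input_tokens)  # input processed a single time
    output = []
    state = None
    for _ in range(max_new_tokens):
        # Each step reads the fixed encoder memory plus tokens emitted so far.
        token, state = model.decode_step(memory, output, state)
        if token == eos_id:
            break
        output.append(token)
    return output


class EchoModel:
    """Trivial stand-in model: 'encodes' by storing the tokens, then
    emits them back one per decode step, followed by EOS (0)."""

    def encode(self, tokens):
        return list(tokens)

    def decode_step(self, memory, output_so_far, state):
        i = len(output_so_far)
        return (memory[i] if i < len(memory) else 0), state
```

Because `memory` is computed once and never revisited, the per-token work inside the loop stays constant, which is the latency and throughput benefit the post describes.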
NPU Optimization Strategies for Copilot+ Devices

Through extensive NPU integration work, Microsoft engineers tailored Mu's architecture to the hardware. Key optimizations include sizing model dimensions to match the NPU's parallelism and memory limits, rebalancing parameters between the encoder and decoder for better performance, and other efficiency techniques.
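
One concrete example of the first optimization: model dimensions are commonly chosen, or padded, to multiples of the accelerator's preferred tile width so matrix operations map cleanly onto the hardware. A small sketch follows; the tile size of 64 is an illustrative assumption, not a documented property of these NPUs.

```python
def align_up(dim: int, tile: int = 64) -> int:
    """Round a model dimension up to the next multiple of `tile`."""
    return ((dim + tile - 1) // tile) * tile


# A hidden width of 1000 would be padded to 1024 on hardware that
# prefers 64-wide tiles, avoiding ragged partial tiles.
print(align_up(1000))  # 1024
```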

Shared weight implementations for input/output token processing substantially reduce parameter counts, a critical factor for maintaining performance on memory-constrained NPUs.
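
The savings from sharing one embedding matrix between the input lookup and the output projection are easy to quantify. The dimensions below are illustrative assumptions, not Mu's published figures.

```python
# Back-of-envelope illustration of weight tying: sharing one embedding
# matrix between input lookup and output projection removes an entire
# vocab_size x d_model block of parameters.

vocab_size = 32_000  # assumed tokenizer vocabulary size
d_model = 1_024      # assumed hidden width

untied = 2 * vocab_size * d_model  # separate input and output matrices
tied = vocab_size * d_model        # one shared matrix

saved = untied - tied
print(saved)  # 32768000 parameters saved in this toy configuration
```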

Mu proactively avoids operations that would be either unsupported or inefficient on NPU hardware.

Architectural refinements to transformer structures combined with model quantization techniques further improve NPU power efficiency.
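
Quantization replaces high-precision weights with small integers plus a scale factor, which shrinks memory traffic and power draw on the NPU. Here is a toy sketch of symmetric per-tensor int8 quantization, a common post-training scheme; the article does not specify Mu's exact method.

```python
def quantize_int8(weights):
    """Map floats to int8 values in [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127.0  # assumes a nonzero tensor
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)  # close to the original weights
```

The int8 values plus one float scale take roughly a quarter of the memory of 32-bit weights, at the cost of a small rounding error per value.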

The AI agent in Windows 11 Settings is currently available in Insider Preview builds on the Dev Channel. For now it is limited to Snapdragon-based Copilot+ PCs; Microsoft says AMD- and Intel-based platforms will gain support in future updates.