Google Launches New LiteRT Accelerator to Speed Up AI Workloads on Snapdragon Android Devices

2025-12-01

Google has introduced a new NPU accelerator for LiteRT built on Qualcomm AI Engine Direct (QNN), designed to enhance AI performance on Qualcomm-powered Android devices featuring Snapdragon 8 SoCs. The accelerator delivers dramatic speedups: up to 100x faster than CPU execution and up to 10x faster than GPU execution.

Although modern Android devices commonly include GPU hardware, Google software engineers Lu Wang, Weiyi Wang, and Andrew Wang note that relying solely on GPUs for AI workloads can create performance bottlenecks. For instance, they explain that running a compute-intensive text-to-image generation model alongside real-time, ML-based camera segmentation can overwhelm even high-end mobile GPUs, causing stuttering and frame drops that degrade the user experience.

Fortunately, many contemporary mobile devices now integrate Neural Processing Units (NPUs)—dedicated AI accelerators that significantly outperform GPUs on AI tasks while consuming far less power.

Developed in close collaboration with Qualcomm, QNN replaces the previous TFLite QNN delegate. It offers a unified and streamlined workflow by integrating a broad set of SoC compilers and runtimes through a simplified API for developers. Supporting 90 LiteRT operations, QNN aims to enable full-model delegation, a critical factor for achieving peak performance. The solution also includes specialized kernels and optimizations that further accelerate large language models like Gemma and FastVLM.
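
In application code, selecting the NPU is expected to look roughly like the following Kotlin sketch. It assumes LiteRT's CompiledModel API with an Accelerator.NPU option and a bundled model file named model.tflite; the exact class names, options, and fallback behavior should be confirmed against the current LiteRT documentation and the NPU acceleration guide.

```kotlin
import com.google.ai.edge.litert.Accelerator
import com.google.ai.edge.litert.CompiledModel

// Sketch: compile a bundled .tflite model for the NPU, falling back to the GPU
// if NPU delegation is unavailable on the current device. Class and method
// names follow the LiteRT CompiledModel interface but may differ from the
// shipped SDK, so treat this as illustrative rather than definitive.
fun runOnNpu(context: android.content.Context, input: FloatArray): FloatArray {
    val model = try {
        CompiledModel.create(
            context.assets,
            "model.tflite",                        // hypothetical asset name
            CompiledModel.Options(Accelerator.NPU) // request QNN-backed NPU execution
        )
    } catch (e: Exception) {
        // Device or driver without NPU support: fall back to the GPU.
        CompiledModel.create(
            context.assets,
            "model.tflite",
            CompiledModel.Options(Accelerator.GPU)
        )
    }

    val inputBuffers = model.createInputBuffers()
    val outputBuffers = model.createOutputBuffers()
    inputBuffers[0].writeFloat(input)          // copy input tensor data
    model.run(inputBuffers, outputBuffers)     // full-model delegation when all ops are supported
    return outputBuffers[0].readFloat()        // read output tensor data
}
```

Requesting the NPU up front while keeping a GPU fallback keeps an app usable on devices where QNN delegation is not yet available.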

Google benchmarked QNN across 72 machine learning models, with 64 successfully achieving full NPU delegation. Results showed performance gains of up to 100x over CPU and up to 10x over GPU execution.

On Qualcomm’s latest flagship SoC, the Snapdragon 8 Elite Gen 5, the improvements are especially striking: more than 56 models ran in under 5 milliseconds on the NPU, compared to only 13 models achieving that speed on the CPU. This breakthrough unlocks real-time AI experiences previously unattainable on mobile devices.

Google engineers also built a proof-of-concept application leveraging an optimized version of Apple's FastVLM-0.5B vision encoder model. The app can interpret live camera scenes nearly instantaneously. On the Snapdragon 8 Elite Gen 5 NPU, it achieves a time-to-first-token (TTFT) of just 0.12 seconds on 1024×1024 images, with prefill speeds exceeding 11,000 tokens per second and decoding throughput surpassing 100 tokens per second. The model was optimized using int8 weight quantization and int16 activation quantization, which Google engineers describe as the key to tapping into the NPU's fastest int16 kernels.
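
For context on those metrics, time-to-first-token covers everything from submitting the image and prompt until the first output token appears, while prefill and decode throughput count tokens processed per second in each phase. A rough measurement harness, shown below with a hypothetical VlmRunner interface standing in for the actual model runner, illustrates how such numbers are typically collected:

```kotlin
// Hypothetical interface standing in for a vision-language model runner;
// the real FastVLM demo uses LiteRT internals that are not reproduced here.
interface VlmRunner {
    fun prefill(promptTokens: IntArray)   // process encoded image + prompt tokens
    fun decodeNext(): Int                 // produce one output token
}

data class Metrics(val ttftSeconds: Double, val prefillTps: Double, val decodeTps: Double)

fun measure(runner: VlmRunner, promptTokens: IntArray, maxNewTokens: Int): Metrics {
    val start = System.nanoTime()
    runner.prefill(promptTokens)          // prefill phase
    runner.decodeNext()                   // first generated token
    val ttftSeconds = (System.nanoTime() - start) / 1e9

    val decodeStart = System.nanoTime()
    var generated = 1
    while (generated < maxNewTokens) {    // decode phase: one token per step
        runner.decodeNext()
        generated++
    }
    val decodeSeconds = (System.nanoTime() - decodeStart) / 1e9

    return Metrics(
        ttftSeconds = ttftSeconds,
        prefillTps = promptTokens.size / ttftSeconds,   // approximation: treats TTFT as prefill time
        decodeTps = (generated - 1) / decodeSeconds
    )
}
```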

QNN currently supports a limited subset of Android hardware, primarily devices powered by Snapdragon 8 and Snapdragon 8+ SoCs. Developers interested in getting started can refer to the NPU acceleration guide and download LiteRT from GitHub.
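
As a starting point, adding LiteRT to an Android project usually begins with a Gradle dependency along the lines of the snippet below; the coordinates and version are indicative only, and the NPU acceleration guide remains the source of truth for the artifacts that enable QNN.

```kotlin
// app/build.gradle.kts - indicative coordinates only; consult the NPU
// acceleration guide for the artifacts and versions that enable QNN.
dependencies {
    // Core LiteRT runtime (formerly TensorFlow Lite)
    implementation("com.google.ai.edge.litert:litert:1.0.1")
    // NPU/QNN support may require additional device- or vendor-specific
    // dependencies distributed via the LiteRT GitHub releases.
}
```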