DeepSeek's New R1 AI Model Lite Version Runs on a Single GPU

2025-05-30

This week, the AI community's attention has largely been on DeepSeek's updated R1 reasoning model. But the Chinese AI lab also released a smaller, "lightweight" version of the new R1, called DeepSeek-R1-0528-Qwen3-8B, which DeepSeek claims outperforms comparably sized models on certain benchmarks.

This compact addition to the R1 family is built on Alibaba's Qwen3-8B model, which launched in May. According to DeepSeek, it outperforms Google's Gemini 2.5 Flash on AIME 2025, a set of demanding competition math problems.

On HMMT, another math benchmark, DeepSeek-R1-0528-Qwen3-8B is nearly on par with Microsoft's recently released Phi 4 reasoning model.

So-called lightweight models like DeepSeek-R1-0528-Qwen3-8B are generally less capable than their full-sized counterparts, but they demand far less compute. According to cloud platform NodeShift, Qwen3-8B can run on a single GPU with 40GB-80GB of VRAM (such as an Nvidia H100), while the full version of the new R1 needs around a dozen 80GB GPUs.
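Those hardware figures line up with simple back-of-envelope arithmetic: weight memory is roughly parameter count times bytes per parameter. A rough sketch (my own estimate, not NodeShift's methodology; activations and KV cache add overhead on top):

```python
# Back-of-envelope GPU memory estimate for model weights alone.
# Assumption: memory ~= parameter count * bytes per parameter
# (16-bit weights = 2 bytes/param, 8-bit = 1 byte/param).

def weight_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in (decimal) gigabytes."""
    return params * bytes_per_param / 1e9

# Qwen3-8B at 16-bit precision: ~16 GB of weights, so a 40-80 GB
# card like the H100 leaves headroom for activations and KV cache.
print(weight_gb(8e9, 2))     # -> 16.0

# The full R1 is far larger (DeepSeek lists it at 671B total
# parameters); even at 8-bit precision the weights alone run to
# hundreds of GB, hence roughly a dozen 80 GB GPUs.
print(weight_gb(671e9, 1))   # -> 671.0
```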

DeepSeek created DeepSeek-R1-0528-Qwen3-8B by fine-tuning Qwen3-8B on text generated by the updated R1. On the model's page on the AI development platform Hugging Face, DeepSeek describes DeepSeek-R1-0528-Qwen3-8B as "intended for academic research in reasoning models and industrial development focused on small-scale models."
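This recipe, fine-tuning a small "student" model on outputs generated by a larger "teacher", is a standard distillation pattern. A toy PyTorch sketch of the loop's shape (tiny stand-in networks purely for illustration, not DeepSeek's actual pipeline or scale):

```python
# Toy illustration of distillation-style fine-tuning: the teacher's
# outputs serve as supervision targets for a smaller student model.
# In the real setting the "outputs" are reasoning text from R1 and the
# student is Qwen3-8B; here both models are tiny tensors.
import torch
import torch.nn as nn

torch.manual_seed(0)

# "Teacher": a fixed, larger network whose outputs become training data.
teacher = nn.Sequential(nn.Linear(16, 64), nn.Tanh(), nn.Linear(64, 8))
for p in teacher.parameters():
    p.requires_grad_(False)

# "Student": a smaller network fine-tuned to imitate the teacher.
student = nn.Linear(16, 8)

opt = torch.optim.Adam(student.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

x = torch.randn(256, 16)        # stand-in for input prompts
with torch.no_grad():
    y_teacher = teacher(x)      # stand-in for teacher-generated outputs

losses = []
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(student(x), y_teacher)  # imitate the teacher
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The student never sees ground-truth labels, only the teacher's behavior, which is what lets a capable large model transfer some of its ability to a much cheaper one.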

DeepSeek-R1-0528-Qwen3-8B is available under the permissive MIT license, meaning it can be used commercially without restriction. Several hosts, including LM Studio, already offer the model through an API.