On January 26, 2025, Alibaba launched version 0.2 of its AI platform Qwen Chat and simultaneously released Qwen2.5-1M, its latest series of open-source language models. The update marks a significant step in Alibaba's development of multimodal AI tools.
This new version of Qwen Chat integrates three primary features: web search, video creation, and image generation. Users can now perform real-time web searches directly within the chat interface, create videos based on prompts, and generate high-quality images from text descriptions. These new capabilities complement existing functions such as document analysis, item creation, and image understanding, making Qwen Chat a versatile tool suitable for both professional tasks and creative projects.
In addition, Alibaba has introduced the Qwen2.5-1M series of open-source language models, consisting of Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M. Both models can process up to one million tokens in a single context, a distinct advantage for tasks with very large text inputs such as document summarization, code analysis, and long-form content generation. This is a significant leap over most existing language models, whose context windows are typically limited to tens of thousands of tokens.
Alongside the new models, Alibaba has also released an inference framework based on vLLM (a high-performance serving system for large language models). The framework incorporates sparse attention techniques and processes one-million-token inputs three to seven times faster than traditional methods. Sparse attention restricts each token to attending over the most relevant parts of the input rather than the entire sequence, reducing computational cost while maintaining accuracy.
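As a rough illustration of why restricting attention reduces cost, the sketch below implements single-head causal attention in which each query sees only a recent window of tokens plus a few leading tokens. This is a generic sliding-window pattern chosen for simplicity, not the specific sparse attention method used in Alibaba's framework, which the announcement does not detail. The names allowed_keys, sparse_attention, scored_pairs, window, and n_global are hypothetical.

```python
import math


def allowed_keys(q_pos, window, n_global):
    """Key positions that the query at q_pos may attend to (causal).

    Assumed pattern: the first n_global tokens plus the most recent
    `window` tokens, including q_pos itself.
    """
    start = max(0, q_pos - window + 1)
    global_part = range(0, min(n_global, start))  # avoids overlap with the window
    local_part = range(start, q_pos + 1)
    return list(global_part) + list(local_part)


def sparse_attention(q, k, v, window, n_global):
    """Single-head causal attention restricted to allowed_keys."""
    dim = len(q[0])
    out = []
    for i, qi in enumerate(q):
        idx = allowed_keys(i, window, n_global)
        # Scaled dot-product scores, computed only for the allowed keys.
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(dim) for j in idx
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * v[j][d] for w, j in zip(weights, idx)) / total
                for d in range(len(v[0]))
            ]
        )
    return out


def scored_pairs(seq_len, window, n_global):
    """Number of query-key scores the sparse pattern computes."""
    return sum(len(allowed_keys(i, window, n_global)) for i in range(seq_len))


def dense_pairs(seq_len):
    """Number of query-key scores full causal attention computes."""
    return seq_len * (seq_len + 1) // 2


if __name__ == "__main__":
    n = 1000
    print("dense: ", dense_pairs(n))
    print("sparse:", scored_pairs(n, window=64, n_global=4))
```

With a window much smaller than the sequence, the number of scored pairs grows roughly linearly with sequence length rather than quadratically, which is the source of the savings on very long inputs. Production systems select which positions to keep in more sophisticated ways so that accuracy is preserved.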
The release also includes a technical report and a blog post detailing the architecture and performance of the Qwen2.5-1M series. Users can try the models on several platforms: real-time interaction via Alibaba's Qwen Chat, experimentation through Hugging Face, and additional deployment options via ModelScope.
Qwen Chat is part of Alibaba Cloud's broader AI initiative aimed at creating user-friendly, high-performance tools. The Qwen model family includes various specialized models, such as Qwen2.5-Coder for programming and Qwen2-VL-Max for visual-language tasks. These models are known for their robust multilingual support and extended context lengths, with some handling up to 128K tokens.
Users can access the new features via toggle options within the chat interface. Web search results are integrated directly into conversations, while the text-to-video and image generation tools let users create media by entering prompts. Existing features, such as document uploads and item creation, remain accessible within the same interface.