Alibaba has introduced Qwen3, a new series of large language models (LLMs) that the company considers a significant milestone on the path to artificial general intelligence (AGI) and artificial superintelligence (ASI). These models introduce hybrid reasoning and support over 100 languages, marking a major leap forward in multilingual AI capabilities.
The series consists of eight models, all released globally as open source. By dynamically switching between "thinking" and "non-thinking" modes, Qwen3 is poised to compete with some of the best-performing AI systems today.
Exploring Qwen3
Qwen3 models aim to advance hybrid reasoning, multilingual support, and agent capabilities. The series includes six dense models and two Mixture of Experts (MoE) models, with parameters ranging from 0.6 billion to 235 billion. Beyond its impressive scale, what are the core features of Qwen3?
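Before walking through those features, a quick aside on the MoE term: in the flagship Qwen3-235B-A22B, "A22B" refers to the roughly 22 billion parameters activated per token, because a gating network routes each token to only a few experts. The toy NumPy sketch below illustrates that top-k routing idea; it is a conceptual illustration with made-up shapes and names, not Qwen3's actual architecture.

```python
# Conceptual illustration only: a toy top-k MoE layer, not Qwen3's real implementation.
import numpy as np

def moe_layer(x, expert_weights, gate_weights, k=2):
    """Route a token vector x to the top-k experts and mix their outputs."""
    scores = x @ gate_weights                      # gating scores, shape (num_experts,)
    top_k = np.argsort(scores)[-k:]                # indices of the k highest-scoring experts
    probs = np.exp(scores[top_k]) / np.exp(scores[top_k]).sum()  # softmax over chosen experts

    # Only the selected experts run, which is how a very large MoE model can
    # activate a small fraction of its total parameters for each token.
    out = np.zeros_like(x)
    for p, idx in zip(probs, top_k):
        out += p * (x @ expert_weights[idx])       # each "expert" is a toy linear layer here
    return out

rng = np.random.default_rng(0)
d, num_experts = 8, 4
x = rng.normal(size=d)
experts = rng.normal(size=(num_experts, d, d))
gate = rng.normal(size=(d, num_experts))
print(moe_layer(x, experts, gate).shape)           # (8,)
```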
Hybrid Reasoning and Mode Switching
Qwen3 employs a dual-mode system that allows users to switch between "thinking mode" for complex reasoning and coding tasks, and "non-thinking mode" for rapid responses and general conversations. This flexibility enables users to optimize for depth or speed depending on the task, ensuring efficient use of computational resources.
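In practice, the switch is exposed through the chat template. The sketch below uses the Hugging Face transformers library and assumes the enable_thinking flag described on the Qwen3 model cards; treat it as a starting point rather than a definitive recipe.

```python
# Sketch, assuming the transformers library and the enable_thinking flag
# documented on the Qwen3 model cards (Hugging Face Hub).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]

# Thinking mode: the model produces an internal reasoning trace before its answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,   # set False for fast, non-thinking responses
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```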
Advanced Agent Capabilities
These AI models demonstrate advanced agent abilities, seamlessly integrating external tools in both thinking and non-thinking modes. Qwen3 can execute complex tool-augmented tasks with precision, establishing itself as one of the most powerful open-source models for agent applications.
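As a rough illustration of what a tool-augmented loop looks like, the sketch below wires a single toy tool to a model behind a placeholder chat() callable, which stands in for whatever Qwen3 interface you use (local transformers, a hosted endpoint, etc.). The JSON tool-call convention shown here is an assumption made for the example, not Qwen3's official protocol.

```python
# Illustrative tool-use loop; chat() is a placeholder for your Qwen3 interface,
# and the JSON calling convention is an assumption for this sketch.
import json

def get_weather(city: str) -> str:
    """A toy tool the model is allowed to call (hypothetical data)."""
    return f"22°C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

SYSTEM = (
    "You can call tools. To call one, reply with JSON only: "
    '{"tool": "<name>", "arguments": {...}}. '
    "Available tool: get_weather(city: str)."
)

def run_agent(user_question: str, chat) -> str:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_question}]
    reply = chat(messages)                       # the model decides whether to call a tool
    try:
        call = json.loads(reply)
        result = TOOLS[call["tool"]](**call["arguments"])
        # Feed the tool result back so the model can compose the final answer.
        messages += [{"role": "assistant", "content": reply},
                     {"role": "user", "content": f"Tool result: {result}"}]
        return chat(messages)
    except (json.JSONDecodeError, KeyError):
        return reply                             # model answered directly, no tool call
```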
Extensive Multilingual Support
Qwen3 supports 119 languages and dialects, aiming for global accessibility. Its robust multilingual capabilities allow it to perform high-quality instruction following and translation across various linguistic contexts.
Top-Tier Benchmark Performance
The flagship model, Qwen3-235B-A22B, excels on industry benchmarks, outperforming OpenAI's o1 and DeepSeek-R1 in coding, mathematics, and general reasoning. It also surpasses OpenAI's o3-mini and Google's Gemini 2.5 Pro on competitive-programming evaluations such as Codeforces.
Vast and Diverse Training Data
Trained on over 36 trillion tokens—including textbooks, question-answer pairs, code, and synthetic data—Qwen3's extensive training set underpins its strong reasoning and instruction-following performance.
Open-Source Accessibility
All Qwen3 models are released under the Apache 2.0 license and are freely available for use and integration on platforms like Hugging Face, ModelScope, Kaggle, and GitHub. This encourages widespread adoption and community-driven development.
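Getting started takes only a few lines. The sketch below assumes the transformers text-generation pipeline (after pip install transformers) and a model id following the Qwen/Qwen3-* naming used on the Hugging Face Hub.

```python
# Minimal sketch: pulling a small Qwen3 checkpoint from the Hugging Face Hub
# with the transformers text-generation pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen3-0.6B",   # model id assumed from the Hub's Qwen/Qwen3-* naming
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "In one sentence, what is a Mixture of Experts model?"}]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```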
Expanded Context Handling
Qwen3-8B, one of the models in the series, features 8.2 billion parameters distributed across 36 layers. It can process up to 32,768 input tokens at once, enabling it to handle tasks requiring extensive context, such as document summarization or multi-step conversations.
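A practical consequence: before sending a long document, it is worth checking its token count against that window. A minimal sketch, assuming the Qwen3-8B tokenizer from the Hugging Face Hub:

```python
# Sketch: checking a document against a 32,768-token input window before
# sending it for summarization.
from transformers import AutoTokenizer

MAX_INPUT_TOKENS = 32_768
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

def fits_in_context(document: str, reserve_for_output: int = 1_024) -> bool:
    """Return True if the document plus headroom for the reply fits in one pass."""
    n_tokens = len(tokenizer.encode(document))
    return n_tokens + reserve_for_output <= MAX_INPUT_TOKENS

long_report = "..."  # your document here
if fits_in_context(long_report):
    print("Summarize in a single request.")
else:
    print("Split into chunks, summarize each one, then merge the summaries.")
```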
Qwen3 Ushers in a New Era of Accessible AI Innovation
With advanced features like hybrid reasoning, MoE architecture, and extensive multilingual support, coupled with cost-effective scalability, Qwen3 opens up new possibilities for users and businesses alike. Companies can now deploy powerful AI models tailored to their specific needs and budgets, while users benefit from smarter, more context-aware tools and services.
With AGI as its core mission, Alibaba presents Qwen3 as a testament to that ambition. The series sets a new standard for accessible, high-performance AI capable of transforming industries.