Alibaba's New Qwen Model Clones Voice from Three-Second Audio Clip

2025-12-24

Alibaba Cloud's Tongyi Qianwen (Qwen) team has unveiled two new AI voice models: one that designs a voice from a text description and one that clones a voice from a short audio sample. The first, Qwen3-TTS-VD-Flash, lets users create highly customized voices from detailed descriptions, with precise control over characteristics such as emotion and speaking speed. For instance, users can generate a "loud baritone from a middle-aged man: energetic, fast-paced TV shopping-style speech with exaggerated intonation and strong sales appeal." According to the team's own reports, this model outperforms OpenAI's gpt-4o-mini-tts interface, introduced in spring 2025.

The second model, Qwen3-TTS-VC-Flash, needs only a three-second audio sample to clone a voice and can reproduce that voice's characteristics in ten languages. The Tongyi Qianwen team claims it achieves a lower error rate than competing solutions such as ElevenLabs and MiniMax. It also handles complex texts, simulates animal sounds, and extracts target voices from recordings. Both models are accessible through Alibaba Cloud's API, and demo versions are available for testing on Hugging Face.
