Meta is extending its "Segment Anything" approach, previously applied to images and video, into the audio domain. The new AI model, SAM Audio, can isolate individual sound sources from complex audio mixtures using text commands, temporal markers, or visual clicks.
According to Meta, the system is the first unified model capable of handling sound-separation tasks across these diverse input modalities. Instead of relying on separate tools for different use cases, it responds flexibly to whichever kind of prompt the user provides.
The system offers three interchangeable control methods. Users can enter text prompts—such as “barking dog” or “singing voice”—to extract specific sounds. They can click directly on objects or people in a video to capture corresponding audio. Alternatively, they can use time-based markers, known as span prompts, to identify segments where the target sound occurs.
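Meta has not published a public API for SAM Audio, but the three interchangeable control methods can be sketched as data types that a unified model would accept through a single entry point. The class and field names below are illustrative assumptions, not Meta's actual interface:

```python
from dataclasses import dataclass

# Hypothetical prompt types mirroring SAM Audio's three control methods.
# All names and fields here are assumptions for illustration only.

@dataclass
class TextPrompt:
    description: str      # e.g. "barking dog" or "singing voice"

@dataclass
class VisualPrompt:
    frame_index: int      # video frame the user clicked on
    x: float              # normalized click coordinates (0-1)
    y: float

@dataclass
class SpanPrompt:
    start_s: float        # start of the segment where the target sound occurs
    end_s: float          # end of that segment


def describe_prompt(prompt) -> str:
    """Route any of the three prompt kinds to one summary string,
    the way a unified model accepts them interchangeably."""
    if isinstance(prompt, TextPrompt):
        return f"isolate sound matching text: {prompt.description!r}"
    if isinstance(prompt, VisualPrompt):
        return (f"isolate sound of object clicked at "
                f"({prompt.x:.2f}, {prompt.y:.2f}) in frame {prompt.frame_index}")
    if isinstance(prompt, SpanPrompt):
        return f"isolate sound active from {prompt.start_s}s to {prompt.end_s}s"
    raise TypeError(f"unsupported prompt type: {type(prompt).__name__}")
```

The point of the single `describe_prompt` entry point is the design Meta highlights: one model, many prompt modalities, rather than a separate tool per use case.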
Potential applications span music production, podcasting, and film editing—such as removing traffic noise from outdoor footage or isolating instruments within a recording.