Today, Xiaohongshu has partnered with Fudan University to unveil InstanceAssemble, a groundbreaking solution in the field of layout-to-image generation. By introducing an innovative "instance assembly attention" mechanism, this approach enables precise image synthesis from simple to complex and sparse to dense layouts. The research has been accepted by NeurIPS 2025.
In recent years, AI-driven image generation has evolved rapidly, moving from early text-to-image models toward layout-controlled methods that generate images according to spatial constraints supplied by the user, such as bounding boxes, segmentation masks, or pose skeletons.
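For concreteness, a layout condition of this kind can be expressed as a list of instances, each pairing a bounding box with a short text description. The schema below is purely illustrative and is not the data format of InstanceAssemble or any particular model:

```python
# Illustrative layout specification: each instance pairs a normalized
# bounding box (x_min, y_min, x_max, y_max) with a text description.
# This schema is an assumption for exposition only.
layout = [
    {"bbox": (0.05, 0.55, 0.40, 0.95), "prompt": "a golden retriever lying on grass"},
    {"bbox": (0.45, 0.10, 0.90, 0.60), "prompt": "a red kite flying in the sky"},
    {"bbox": (0.00, 0.00, 1.00, 0.35), "prompt": "cloudy blue sky"},
]
```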
A major challenge in layout-to-image generation is getting the model to place each object in its designated region while rendering content that actually matches its description. Common failure modes include misaligned object positioning, semantic inconsistencies between prompts and rendered objects, and excessive computational cost.
The newly introduced InstanceAssemble framework, developed jointly by Fudan University and Xiaohongshu, successfully achieves fine-grained control over object placement, marking a significant leap toward "precise composition" in AI-generated imagery.
Built upon the state-of-the-art diffusion transformer architecture, InstanceAssemble introduces a novel "instance assembly attention" module. Users simply input bounding box coordinates and textual descriptions for each object, and the model generates semantically appropriate content at the specified locations. Whether handling scenes with just a few elements or highly cluttered environments, InstanceAssemble maintains high fidelity in both layout alignment and semantic coherence.
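While the paper's exact formulation is not reproduced here, the core routing idea, letting each image region attend only to the text tokens of the instances whose boxes cover it, can be sketched as a masked cross-attention. Everything below (function names, shapes, the single-head simplification) is an assumption for illustration, not the released implementation:

```python
import torch
import torch.nn.functional as F

def box_mask(bbox, grid=16):
    """Boolean mask over a flattened grid x grid latent map,
    True at positions whose centers fall inside the normalized box."""
    centers = (torch.arange(grid) + 0.5) / grid
    yy, xx = torch.meshgrid(centers, centers, indexing="ij")
    x0, y0, x1, y1 = bbox
    return ((xx >= x0) & (xx < x1) & (yy >= y0) & (yy < y1)).flatten()

def instance_masked_attention(img_tokens, inst_tokens, inst_bboxes, grid=16):
    """Single-head cross-attention in which each image token (query) may
    attend only to the text tokens of instances whose boxes cover it.
    img_tokens: (grid*grid, d); inst_tokens: list of (T_i, d) tensors."""
    d = img_tokens.shape[-1]
    keys = torch.cat(inst_tokens, dim=0)            # (sum_i T_i, d)
    cols = [box_mask(b, grid).unsqueeze(1).expand(-1, t.shape[0])
            for t, b in zip(inst_tokens, inst_bboxes)]
    mask = torch.cat(cols, dim=1)                   # (grid*grid, sum_i T_i)
    scores = (img_tokens @ keys.T) / d ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    scores[~mask.any(dim=1)] = 0.0                  # uncovered: uniform attention
    return F.softmax(scores, dim=-1) @ keys         # (grid*grid, d)

# Example: two instances, a 16x16 latent grid, feature dimension 32.
img = torch.randn(16 * 16, 32)
texts = [torch.randn(4, 32), torch.randn(3, 32)]
boxes = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
out = instance_masked_attention(img, texts, boxes)  # (256, 32)
```

The actual module naturally involves learned projections, multiple heads, and integration with the diffusion transformer's blocks; the sketch conveys only the per-instance routing idea.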
Notably, the method adopts a lightweight adaptation strategy that significantly lowers deployment barriers. No full-model retraining is required: adapting Stable Diffusion 3-Medium adds only about 71 million parameters (roughly 3.46% extra), while adapting Flux.1 adds merely 0.84%.
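Overhead figures like these are easiest to read in adapter terms: freeze the pretrained backbone and train only small residual modules. The sketch below uses a generic bottleneck adapter, an assumption rather than the paper's actual module design, and shows how such a percentage can be measured:

```python
import torch.nn as nn
import torch.nn.functional as F

class BottleneckAdapter(nn.Module):
    """Generic residual adapter (down-project, GELU, up-project).
    Illustrative assumption; the paper's adaptation modules may differ."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # adapter starts as the identity map
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(F.gelu(self.down(x)))

def parameter_overhead(base: nn.Module, added: nn.Module) -> float:
    """Extra trainable parameters as a percentage of the frozen base."""
    n_base = sum(p.numel() for p in base.parameters())
    n_added = sum(p.numel() for p in added.parameters())
    return 100.0 * n_added / n_base

# Typical usage (the loader name is hypothetical):
#   base = load_pretrained_backbone()
#   for p in base.parameters():
#       p.requires_grad_(False)          # freeze the backbone
#   adapters = nn.ModuleList(BottleneckAdapter(1536) for _ in range(24))
#   print(f"{parameter_overhead(base, adapters):.2f}% extra parameters")
```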
To better assess layout-image alignment, the research team also introduced DenseLayout, a new benchmark of 5,000 images with 90,000 annotated instances (an average of 18 per image), along with a proposed evaluation metric, the Layout Grounding Score (LGS). On this densely packed benchmark, InstanceAssemble outperformed existing approaches by a large margin.
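The article does not spell out how LGS is computed. One plausible shape for such a grounding metric, offered strictly as an assumption, is to run a detector on each generated image and score every layout instance (using the layout schema from the earlier sketch) by how well a detection with a matching label overlaps its box:

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def layout_grounding_score(layout, detections):
    """Hypothetical grounding score: for each layout instance, take the
    best-IoU detection whose label appears in the instance prompt, then
    average over instances. An illustrative stand-in, not the paper's LGS.
    layout:     [{"bbox": ..., "prompt": ...}, ...]
    detections: [{"bbox": ..., "label": ...}, ...] from some detector.
    """
    scores = []
    for inst in layout:
        matches = [iou(inst["bbox"], d["bbox"])
                   for d in detections if d["label"] in inst["prompt"]]
        scores.append(max(matches, default=0.0))
    return sum(scores) / len(scores) if scores else 0.0
```

A real metric would likely use a stronger matching rule than substring lookup (for example, embedding similarity between detection labels and instance prompts) and aggregate scores over the whole benchmark.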
Experiments show that InstanceAssemble excels across diverse layout conditions. Even when trained exclusively on sparse layouts (≤10 instances), it maintains robust performance on dense layouts (>10 instances).
The technology is now open-sourced, with code and pre-trained models publicly available on GitHub, offering strong support for applications in design, advertising, and digital content creation.