Midjourney Launches "Patchwork" Tool to Expand AI Image Creation Boundaries

2024-12-12

Midjourney, the popular AI image generation startup whose Discord server has more than 21 million users, has officially launched its latest tool, "Patchwork." The tool aims to expand the boundaries of AI image creation and editing and to give users a new kind of creative experience.

Max Kreminski, head of Midjourney's Storytelling Lab, showcased Patchwork's features during a livestream on Discord and X. He explained that Patchwork operates as a standalone application that users access by logging in with their Midjourney accounts. The URL for the application's "research preview" is posted in the "Updates" channel of the Midjourney Discord server, and users must link their Midjourney Discord accounts with their Google accounts to gain access.

Patchwork is a web-based infinite canvas with a blank white background. On the left side, there is a "Toolbox" containing buttons such as "Characters," "Events," "Factions," "Locations," "Props," and "Random," as well as tools like "Notes," "Images," "Portals," "Save," and "Share." Users can utilize these tools to create and manage various digital "worlds" on the canvas.

Each world consists of an individual canvas, and users can switch between worlds by creating "Portals." To generate a new world, users enter a text prompt in the editor bar at the top of the "Create" screen and select one or more of ten image styles.

This creates a new white canvas populated with a series of static image assets and text boxes, or entities, referred to as "Fragments." These fragments include input boxes that let users generate new images or settings based on the initial world description, and even entirely new AI-generated character descriptions.

During the live demonstration, the character name was automatically filled in as Marcus "Dizzy" Gillespie, echoing the name of the renowned jazz musician Dizzy Gillespie. Dragging the description into the new character image creation box generated four new AI-generated images. Users can also add new character boxes and prompt the tool to create names, traits, and motivations, which can spark conflicts and lay the foundation for a story.

Users can draw lines between characters to indicate their relationships. They can also write action sequences and scene descriptions, each narrating a story. Each character can be used across multiple images, and these images can be grouped together through an option.

Users can "Share" their canvases with other Midjourney users for collaboration. According to Kreminski, up to 100 users can collaborate in real time within the same world, although the experience may become increasingly chaotic as the number of users grows.

Kreminski also said that only logged-in users can currently view canvases, but non-users may be able to access them in the future. He mentioned that tabletop role-playing game groups have already started using Patchwork to plan their games. Additionally, Midjourney version 7 (V7) will include a setting for keeping multiple characters consistent across different and new images.

Patchwork is powered by at least three different large language models, including an open-source model optimized specifically for Midjourney. Looking ahead, Kreminski stated that there is a "very clear path to enhancing the details and interactions within the worlds," including fully immersive 3D virtual reality scenes, though this may take several years.

Meanwhile, other AI researchers, startups such as Fei-Fei Li's World Labs, and major tech companies like Google are also working on AI that can create immersive, navigable 3D worlds from simple prompts or images.

Midjourney founder David Holz also announced during the livestream that the company will introduce multiple model personalization modes in the coming days. Currently, Midjourney lets users personalize the kind of visuals they see in generations by rating images, which fine-tunes the models to their individual preferences. With the update, users will be able to keep multiple personalized versions and switch between them.

Holz also revealed that Midjourney will let users upload multiple images to the canvas and reference them to guide generation. Moreover, sometime after Christmas (December 25), Midjourney will launch video models and the Midjourney V7 AI image generator, the latter with improved prompt understanding.

Holz further stated that Midjourney is developing three to four new hardware projects and aims to grow the company into a comprehensive research laboratory. He anticipates that it may take around six months to announce all of these projects.