ByteDance Launches Doubao Visual Understanding Model, Significantly Reducing Application Costs

2024-12-19

At the Volcano Engine Force Conference yesterday, ByteDance unveiled its latest Doubao Visual Understanding Model, which is intended to give businesses capable and cost-effective multimodal large model capabilities. The model is reported to process large volumes of data efficiently and is priced at 0.003 yuan (3 li) per thousand tokens, which means businesses can process 284 images at 720P resolution for 1 yuan. ByteDance said this price is 85% lower than the industry average.
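The pricing figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official calculator: it assumes the quoted price is 0.003 yuan per thousand tokens and that all of the 1 yuan budget goes to image tokens. The function names are illustrative, and the per-image token count it derives is implied by the article's figures rather than stated by ByteDance.

```python
def tokens_per_yuan(price_per_1k_tokens_yuan: float) -> float:
    """Number of tokens that 1 yuan buys at the given per-thousand-token price."""
    return 1000.0 / price_per_1k_tokens_yuan


def implied_tokens_per_image(price_per_1k_tokens_yuan: float, images_per_yuan: int) -> float:
    """Average tokens per image implied by 'N images for 1 yuan'."""
    return tokens_per_yuan(price_per_1k_tokens_yuan) / images_per_yuan


def implied_baseline_price(price_per_1k_tokens_yuan: float, discount: float) -> float:
    """Comparison price implied by 'X% lower than the industry average'."""
    return price_per_1k_tokens_yuan / (1.0 - discount)


if __name__ == "__main__":
    price = 0.003   # yuan per thousand tokens (assumed reading of the quoted price)
    images = 284    # 720P images per yuan, as quoted
    discount = 0.85 # "85% lower than the industry average", as quoted

    print(f"Tokens per yuan: {tokens_per_yuan(price):,.0f}")
    print(f"Implied tokens per 720P image: {implied_tokens_per_image(price, images):,.0f}")
    print(f"Implied industry average: {implied_baseline_price(price, discount):.3f} yuan per thousand tokens")
```

Under these assumptions, 1 yuan buys roughly 333,000 tokens, which works out to about 1,170 tokens per 720P image, and the implied industry-average price is about 0.02 yuan per thousand tokens.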

Li Liang, Vice President of Douyin Group, commented on the release on social media today. He said, "Our intention is not to start a price war. The Doubao large model can be priced this low because of the technological innovation behind it. We have made significant optimizations in algorithms, software engineering, and hardware solutions, which allow us to maintain a reasonable profit margin even at 0.003 yuan per thousand tokens. More importantly, we offer a straightforward and transparent pricing strategy, eliminating the complex 'list price + discount' model commonly seen in the industry. Our goal is to promote the widespread adoption and development of AI technology. As Tan Dai said, 'An excellent model should be affordable for every business.'"

The conference also marked the debut of the Doubao 3D Generation Model. Combined with Volcano Engine's digital twin platform veOmniverse, the model can handle tasks including intelligent training, data synthesis, and digital asset creation. ByteDance describes the combination as "a physical world simulation suite specifically designed for AIGC creation."

Several other products in the Doubao large model family also received significant updates. ByteDance said the Doubao General Model Pro is now fully on par with GPT-4o while being priced at one-eighth of the latter. The Music Model can now generate complete works up to 3 minutes long. Version 2.1 of the Text-to-Image Model is more accurate, can render Chinese characters in generated images, and supports editing an image with a single-sentence instruction. It has been integrated into JIMENG AI and the Doubao App.