The highly anticipated 2024 Baidu Cloud Intelligence Conference successfully concluded in Beijing, focusing on the latest advancements and future trends in the artificial intelligence sector. At this grand event, Baidu Intelligent Cloud announced comprehensive upgrades to two core AI infrastructures—the Baige AI Heterogeneous Computing Platform 4.0 and Qianfan Large Model Platform 3.0—and simultaneously launched new versions of three native AI application products, including Code Assistant, Intelligent Customer Service, and Digital Humans.
Shen Dou, President of Baidu Intelligent Cloud Business Group, stated at the conference, "As large model technology continues to mature, its industrial applications are advancing at an unprecedented pace. To date, the Wenxin large model has surpassed 700 million daily invocations on the Qianfan platform, successfully assisting users in fine-tuning 30,000 large models and developing over 700,000 enterprise-level applications. Notably, over the past year, the price of the Wenxin flagship large model has decreased by more than 90%, and the main models are now available for free, significantly lowering the barriers for enterprises to adopt AI."
Baige 4.0: A New Era of Compute Power Empowering the Entire Large Model Lifecycle
To comprehensively address the diverse compute requirements of enterprises deploying large models, Baidu Intelligent Cloud introduced the Baige AI Heterogeneous Computing Platform 4.0. This platform has been fully optimized for ultra-large-scale clusters ranging from ten thousand to one hundred thousand GPUs, enhancing the entire chain of compute management from cluster creation and development experiments to model training and inference.
During the cluster creation phase, Baige 4.0 comes pre-installed with mainstream large model training tools, enabling second-level tool deployment and cutting the preparation time for a ten-thousand-GPU cluster from several weeks to one hour. In the development and experimentation phase, its upgraded observability dashboard provides comprehensive monitoring of multi-chip adaptation, cluster efficiency, and automatic task fault tolerance, helping enterprises quickly settle on an optimal model training strategy.
In the model training phase, Baige 4.0 significantly reduces failure frequency by automatically screening cluster health, predicting GPU faults, and migrating workloads in real time. Combined with second-level fault detection and localization and Flash Checkpoint technology, it further shortens cluster fault recovery time. Currently, the effective training rate on ten-thousand-GPU clusters exceeds 99.5%, with overall performance 30% above the industry average.
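The fault-tolerance pattern described here can be illustrated with a toy training loop: checkpoint periodically, and on a simulated fault roll back to the last checkpoint instead of restarting from scratch. This is a minimal sketch only; Baige's actual Flash Checkpoint internals are not public, and all names and parameters below are invented for illustration.

```python
import copy

def train_with_fault_tolerance(total_steps, checkpoint_every, fail_at=None):
    """Toy training loop showing checkpoint-based fault recovery.

    `fail_at` simulates a hardware fault at that step; instead of
    restarting the whole job, the loop resumes from the last
    in-memory checkpoint. (Illustrative stand-in, not Baige's API.)
    """
    state = {"step": 0, "loss": 100.0}
    checkpoint = copy.deepcopy(state)   # last known-good state
    recoveries = 0
    failed_once = False

    while state["step"] < total_steps:
        # Simulate one training step.
        state["step"] += 1
        state["loss"] *= 0.99

        # Simulated GPU fault: roll back to the checkpoint and continue.
        if fail_at is not None and state["step"] == fail_at and not failed_once:
            failed_once = True
            recoveries += 1
            state = copy.deepcopy(checkpoint)
            continue

        # Periodic lightweight checkpoint (in practice this would be an
        # async write to host memory first, then persistent storage).
        if state["step"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)

    return state, recoveries
```

The cost of a failure here is only the steps since the last checkpoint, which is why frequent, cheap checkpointing (the idea behind Flash Checkpoint) matters at ten-thousand-GPU scale.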
During the model inference stage, Baige 4.0 leverages technologies such as a disaggregated serving architecture, KV Cache, and load distribution to cut costs and raise efficiency, notably doubling inference efficiency for long-text processing. The platform has also built a congestion-free high-performance network (HPN) at the hundred-thousand-GPU scale, with 10 ms-granularity network monitoring and minute-level fault recovery for hundred-thousand-GPU clusters, ensuring the stability and efficiency of large model applications.
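Why a KV Cache helps, especially for long text, can be shown with a toy cost model: without a cache, each decoding step must recompute attention keys and values for the whole prefix, giving quadratic total work; with a cache, each token's keys and values are computed once and reused. The function below is a simplified counting sketch, not Baige's implementation.

```python
def decode_ops(seq_len, use_kv_cache):
    """Count key/value projection operations during autoregressive
    decoding of `seq_len` tokens (toy cost model: one op per token
    projected). Without a cache, every step reprojects the entire
    prefix; with a KV cache, each token is projected exactly once."""
    ops = 0
    for step in range(1, seq_len + 1):
        if use_kv_cache:
            ops += 1            # project only the new token; reuse cached K/V
        else:
            ops += step         # reproject the whole prefix at every step
    return ops
```

For a 4-token sequence the uncached cost is 1+2+3+4 = 10 projections versus 4 with the cache, and the gap grows quadratically with sequence length, which is why long-text inference benefits most.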
Qianfan 3.0: Comprehensive Upgrades in Model Invocation, Development, and Application
To meet the diverse needs of enterprises for model invocation, development, and application, Baidu Intelligent Cloud released Qianfan Large Model Platform 3.0. The platform supports invocation of nearly a hundred large models from domestic and international providers, including the Wenxin series, and adds support for invoking traditional small models in areas such as speech and vision. Pricing has also fallen sharply: over the past year the Wenxin flagship large model has dropped in price by more than 90%, and the main models are now available for free.
In terms of model development, Qianfan 3.0 offers a comprehensive large model toolchain, supporting customization and fine-tuning of traditional models in areas like computer vision (CV), natural language processing (NLP), and speech, with unified management and scheduling of data, models, and compute resources. The platform also lets enterprises feed data generated by their applications back into training through sampling evaluation and manual annotation, creating a data flywheel that continuously improves model performance.
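The selection step of such a data flywheel can be sketched simply: route a random sample of production interactions, plus every low-confidence model response, to human annotation for the next fine-tuning round. The field names and thresholds below are illustrative assumptions, not Qianfan's actual schema.

```python
import random

def sample_for_annotation(records, sample_rate, conf_threshold, seed=0):
    """Toy data-flywheel step: pick records for human annotation.

    Selects every low-confidence response (likely model mistakes)
    plus a uniform random sample of the rest (to measure overall
    quality). Annotated results would feed the next fine-tuning run.
    """
    rng = random.Random(seed)   # seeded for reproducible sampling
    selected = []
    for rec in records:
        low_confidence = rec["confidence"] < conf_threshold
        randomly_sampled = rng.random() < sample_rate
        if low_confidence or randomly_sampled:
            selected.append(rec)
    return selected
```

Prioritizing low-confidence outputs concentrates annotation effort where the model is weakest, which is what makes the flywheel improve the model rather than merely accumulate data.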
For enterprise-level application development, Qianfan 3.0 has comprehensively upgraded RAG (Retrieval-Augmented Generation) across multiple dimensions, including retrieval quality, performance, storage scalability, and flexible resource allocation. To raise the development efficiency of enterprise-level Agents, Qianfan 3.0 also adds capabilities such as autonomous business orchestration, manual orchestration, and knowledge injection, along with more than 80 official components.
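The core RAG flow being upgraded here is: retrieve relevant passages for a query, then prepend them to the prompt so the model answers from that context. The sketch below uses simple word-overlap ranking as a stand-in for the vector search a production system like Qianfan would use; all function names are invented for illustration.

```python
def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query (a toy stand-in
    for embedding-based vector search) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_rag_prompt(query, docs, k=2):
    """Assemble a retrieval-augmented prompt: retrieved passages first,
    then the user question, so the model grounds its answer in them."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The dimensions Qianfan highlights (retrieval quality, performance, storage scalability) all live in the `retrieve` step, which is why RAG upgrades center on the retrieval layer rather than the prompt assembly.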
Comprehensive Upgrades to Three Native AI Application Products
Baidu Intelligent Cloud also fully upgraded three native AI application products—Code Assistant, Intelligent Customer Service, and Digital Humans—to meet the needs of enterprises directly purchasing mature AI applications.
· Intelligent Customer Service "Keyue": Reconstructed based on the Wenxin large model, "Keyue" has achieved significant improvements in complex intent understanding and multimodal information exchange, increasing the self-service resolution rate from the industry average of 80% to 92%. To date, "Keyue" has served over 150 million users and handled more than 500 million interactions.
· Xiling Digital Human 4.0: The newly upgraded Xiling Digital Human not only supports the rapid generation of diverse 3D avatars and videos from text but also addresses the rigid movements of traditional 2D digital humans through 4D auto-binding technology and innovative modal transfer techniques. Additionally, the Xiling platform has drastically reduced the price of ultra-realistic 3D digital humans from ten thousand yuan to 199 yuan, further lowering the adoption barriers for enterprises.
· Wenxin Quick Code: The newly upgraded Wenxin Quick Code introduces two major features: "Enterprise-Level Code Architecture Explanation" and "Enterprise-Level Code Review." These features can intelligently interpret engineering architecture, inherit the coding expertise of senior engineers, and deeply understand enterprise codebases, helping enterprises improve overall R&D efficiency by over 20%. Currently, Wenxin Quick Code serves more than 10,000 enterprise clients.
This conference marks a new milestone in Baidu's technological and application capabilities in the AI field, bringing more efficient and intelligent AI solutions to a wide range of enterprise users.