DeepSeek-R1 & V3 API Upgraded with Significant Price Reduction

2025-03-13

Starting today, SiliconCloud has officially launched the Batch Inference feature for its DeepSeek-R1 and V3API. This update represents a significant leap forward in the platform's data processing capabilities. With the new batch API, users are no longer constrained by real-time inference rates and can send requests to SiliconCloud, enabling more efficient handling of large-scale data tasks. It is reported that large-scale data processing tasks can be completed within 24 hours, providing users with a more convenient and flexible data processing solution.

A notable highlight of this feature update is the substantial price reduction. The cost of batch inference for DeepSeek-V3 is 50% lower compared to real-time inference, offering significant cost savings for users. More excitingly, during the special promotional period from March 11 to March 18, the batch inference price for DeepSeek-R1 is discounted by 75%, with input costs as low as 1 yuan per million tokens and output costs at 4 yuan per million tokens. This pricing strategy undoubtedly provides great appeal and value for users interested in trying out the batch inference feature.

The introduction of the batch inference feature aims to help users better handle large-scale data processing tasks such as generating reports and data cleaning, achieving higher efficiency and lower costs. This feature is particularly suitable for scenarios like data analysis and model performance evaluation where real-time responses are not required, allowing users to schedule data processing times more freely and focus on core business development.

Notably, prior to the launch of the batch inference feature, DeepSeek-R1 and V3API had already supported various functionalities such as Function Calling, JSON Mode, Prefix, and FIM. The richness and flexibility of these features enable the platform to meet diverse data processing needs. Additionally, the TPM (Tokens Per Minute) limit for the Pro version of DeepSeek-R1 and V3API has been increased from 10,000 to 1 million, further enhancing the platform's processing capacity and user experience.

With the launch of the batch inference feature and the significant price reduction, SiliconCloud will offer users more powerful, efficient, and convenient data processing services. In the future, the platform will continue to optimize and upgrade its features, providing users with comprehensive data processing solutions to support higher levels of business development and innovation.