New Anthropic technique cuts RAG retrieval failures by up to 67%

2024-09-23

Anthropic has unveiled a novel approach called "Contextual Retrieval," which significantly enhances AI systems' ability to access and leverage information from extensive knowledge bases. This innovation addresses a major limitation inherent in traditional Retrieval-Augmented Generation (RAG) systems.

Contextual Retrieval tackles a fundamental issue in RAG systems: the loss of context when documents are split into smaller chunks for processing. By prepending relevant contextual information to each chunk before it is embedded or indexed, the method preserves essential details that would otherwise be lost.
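
To see why naive chunking loses context, here is a minimal sketch (the document text and chunk size are illustrative): a fixed-size splitter produces a chunk that no longer mentions the company or the quarter it refers to.

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split text into fixed-size character chunks (naive, no overlap)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

document = (
    "ACME Corp filed its Q2 2023 report. "
    "The company's revenue grew by 3% over the previous quarter."
)

chunks = chunk_text(document, 40)
# The second chunk no longer mentions ACME Corp or Q2 2023, so a
# retriever looking for "ACME revenue growth" may fail to match it.
for c in chunks:
    print(repr(c))
```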

In practice, this involves using Anthropic's Claude model to generate chunk-specific context. For instance, a bare statement like "Company revenue increased by 3% compared to the last quarter," once contextualized, gains additional details such as the specific company name and the relevant time frame. This enriched context lets the retrieval system identify and use the correct information more accurately.
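
The contextualization step can be sketched as follows. The prompt template and helper names below are illustrative, and the context string is hard-coded where a real pipeline would obtain it from a call to Claude:

```python
CONTEXT_PROMPT = """\
<document>
{document}
</document>
Here is the chunk we want to situate within the whole document:
<chunk>
{chunk}
</chunk>
Please give a short context that situates this chunk within the overall
document, for the purpose of improving search retrieval."""

def build_context_prompt(document: str, chunk: str) -> str:
    """Prompt sent to the model (e.g. Claude) to generate chunk context."""
    return CONTEXT_PROMPT.format(document=document, chunk=chunk)

def contextualize(chunk: str, generated_context: str) -> str:
    """Prepend model-generated context before embedding/indexing the chunk."""
    return f"{generated_context}\n{chunk}"

chunk = "The company's revenue grew by 3% over the previous quarter."
# In practice this string would come from the model's response:
context = "This chunk is from ACME Corp's SEC filing for Q2 2023."
print(contextualize(chunk, context))
```

The enriched string, not the bare chunk, is what gets embedded and indexed.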

The technology incorporates two essential components: Contextual Embeddings and Contextual BM25. These elements work in tandem to significantly reduce retrieval failures—instances where the AI fails to locate the most relevant information.
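
Each component produces its own ranked list of chunks for a query, and the two lists are then merged. A minimal sketch using reciprocal rank fusion (the fusion method and chunk IDs here are assumptions for illustration, not a confirmed detail of Anthropic's implementation):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) across the lists it appears in;
    k=60 is a conventional smoothing constant.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical top-3 results from each retriever for one query:
embedding_hits = ["c2", "c7", "c9"]   # semantic (Contextual Embeddings)
bm25_hits = ["c2", "c4", "c7"]        # lexical (Contextual BM25)

fused = reciprocal_rank_fusion([embedding_hits, bm25_hits])
print(fused)
```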

Alex Albert, Anthropic's Head of Developer Relations, highlighted the significance of the advance: "Contextual Retrieval reduces the chunk retrieval failure rate by up to 67%. When combined with prompt caching, it may become one of the best techniques for retrieval in RAG applications."

> We are excited to share our latest research on Contextual Retrieval: it cuts the chunk retrieval failure rate by as much as 67%. When combined with prompt caching, it may be one of the top techniques for retrieval in RAG applications. Let me elaborate:
>
> — Alex Albert (@alexalbert__), September 19, 2024

These enhancements are substantial:

- Contextual Embeddings alone have reduced retrieval failure rates by 35%

- Combining Contextual Embeddings with Contextual BM25 has decreased failure rates by 49%

- Adding a reranking step on top of these techniques cuts failure rates by 67% overall
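
The reranking step takes the merged candidate list and re-scores each chunk against the query with a stronger relevance model, keeping only the top few. The sketch below uses a toy lexical-overlap scorer as a stand-in for a real reranking model (such as a cross-encoder or a reranking API):

```python
def rerank(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Keep the top_k chunks by relevance score.

    score() is a toy lexical-overlap stand-in for a real reranking model.
    """
    def score(chunk: str) -> float:
        q_terms = set(query.lower().split())
        c_terms = set(chunk.lower().split())
        return len(q_terms & c_terms) / len(q_terms)

    return sorted(chunks, key=score, reverse=True)[:top_k]

candidates = [
    "Revenue grew by 3% over the previous quarter.",
    "The office relocated to Berlin in 2022.",
    "Quarterly revenue and profit both increased.",
]
print(rerank("revenue growth last quarter", candidates))
```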

These improvements in accuracy directly translate to better performance in downstream tasks, potentially enhancing the quality of AI-generated responses across various applications.

Anthropic's research shows that Contextual Retrieval performs well across diverse knowledge domains, including code repositories, novels, scientific papers, and financial documents. The technique improves results with every embedding model tested, with Gemini and Voyage embeddings proving particularly effective.

A key enabler of Contextual Retrieval is prompt caching, a feature the company announced last month that significantly reduces implementation costs. Anthropic estimates the one-time cost of contextualization at just $1.02 per million document tokens. This cost-effectiveness makes Contextual Retrieval practical for large-scale applications, where manually annotating chunks would be infeasible.
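
At that rate, a back-of-envelope estimate is straightforward (the corpus size below is hypothetical):

```python
COST_PER_MILLION_TOKENS = 1.02  # USD, contextualization cost cited by Anthropic

def contextualization_cost(document_tokens: int) -> float:
    """Estimated one-time cost (USD) to contextualize a corpus."""
    return document_tokens / 1_000_000 * COST_PER_MILLION_TOKENS

# A hypothetical 500-million-token knowledge base:
print(f"${contextualization_cost(500_000_000):.2f}")  # one-time cost in USD
```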

If you are interested in implementing Contextual Retrieval, Anthropic has released a comprehensive guide outlining the process. The company encourages experimentation with this technology and notes that custom prompts tailored to specific domains may yield even better results.