Contextual AI has unveiled its Grounded Language Model (GLM), claiming that it surpasses leading AI systems from Google, Anthropic, and OpenAI on a key factuality benchmark.
According to the startup, which was founded by pioneers of retrieval-augmented generation (RAG), its GLM achieved an 88% factuality score on the FACTS benchmark. In comparison, Google's Gemini 2.0 Flash scored 84.6%, Anthropic's Claude 3.5 Sonnet reached 79.4%, and OpenAI's GPT-4o garnered 78.8%.
Although large language models have significantly transformed enterprise software, factual inaccuracies, often referred to as "hallucinations," remain a major obstacle for businesses adopting these technologies. Contextual AI seeks to address this issue by developing a model specifically optimized for enterprise RAG applications, where precision is of utmost importance.
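To make the pattern concrete, the sketch below illustrates the general RAG idea the article refers to: retrieve relevant passages for a query, then ground the answer strictly in what was retrieved, abstaining when nothing matches. The toy corpus, keyword-overlap scoring, and `grounded_answer` function are illustrative stand-ins, not Contextual AI's actual implementation.

```python
# Minimal sketch of retrieval-augmented generation (RAG) with grounding.
# Hypothetical names and naive keyword scoring; real systems use dense
# retrievers and an LLM conditioned on the retrieved context.

def retrieve(query, corpus, k=2):
    """Rank passages by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def grounded_answer(query, corpus):
    """Answer only from retrieved passages; refuse when nothing matches."""
    q_terms = set(query.lower().split())
    supported = [
        p for p in retrieve(query, corpus)
        if q_terms & set(p.lower().split())
    ]
    if not supported:
        # A grounded model abstains rather than hallucinate an answer.
        return "I don't know based on the provided documents."
    return " ".join(supported)

corpus = [
    "The refund window is 30 days from the date of purchase.",
    "Support hours are 9am to 5pm on weekdays.",
]
print(grounded_answer("What is the refund window?", corpus))
# The unanswerable query below triggers the abstention path.
print(grounded_answer("zebra", corpus))
```

The key design choice is the abstention branch: a model tuned for enterprise factuality is rewarded for declining to answer when its sources do not support a claim, rather than producing a fluent but unsupported response.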
The CEO and co-founder of Contextual AI stated that the company is dedicated to maximizing the potential of RAG technology. Unlike general-purpose models such as ChatGPT or Claude, which are designed to handle a variety of tasks ranging from creative writing to technical documentation, Contextual AI focuses on high-stakes enterprise environments where factual accuracy outweighs creative flexibility.
In highly regulated industries, there is zero tolerance for hallucinations. Consequently, a generic language model that is acceptable for a marketing department is not appropriate for error-sensitive corporate settings.