Contextual AI’s grounded GLM outperforms GPT-4o on factual accuracy, highlighting why enterprise-focused AI matters

Contextual AI today introduced its grounded language model (GLM), which the company says outperforms leading AI systems on a core truthfulness benchmark. The startup, founded by pioneers of retrieval-augmented generation (RAG), reported that the GLM registered an 88% factuality score on the FACTS benchmark, ahead of Google’s Gemini 2.0 Flash at 84.6%, Anthropic’s Claude 3.5 Sonnet at 79.4%, and OpenAI’s GPT-4o at 78.8%. The result signals a deliberate shift toward models built for enterprise environments where precision matters most, and underscores Contextual AI’s mission to curb AI hallucinations in business settings with a model purpose-built for enterprise RAG applications where factual correctness is non-negotiable.

Contextual AI’s leadership emphasizes that the path to reliable enterprise AI is not simply about bigger language models or more creative outputs. Instead, it rests on a refined approach to retrieval-augmented generation that integrates robust data grounding with optimization across all system components. Douwe Kiela, the CEO and cofounder of Contextual AI, spoke in depth about the company’s philosophy in an exclusive conversation, describing how the team’s roots in RAG guided the development. He explained that the company’s work centers on “doing RAG the right way, to the next level of doing RAG.” This commitment reflects a broader trend in the AI industry: organizations are increasingly seeking models that maximize factual reliability in enterprise contexts, rather than models that prioritize broad generality or high creative flexibility alone.

Contextual AI makes a clear distinction between its GLM and general-purpose models like ChatGPT or Claude. While broad models aim to handle a wide spectrum of tasks, from creative writing to technical documentation, Contextual AI targets high-stakes enterprise environments where accuracy is critical and the tolerance for error is extremely low. In regulated industries such as finance, healthcare, and telecommunications, the ability of an AI system to stay grounded in verifiable information, or to acknowledge clearly when it does not know something, is essential. Kiela emphasized that a traditional language model built for broad appeal or marketing demonstrations is not suitable for enterprise settings where the cost of mistakes can be substantial. This framing sets the stage for Contextual AI’s emphasis on reliability and accountability in AI-assisted decision making.

A key theme in Contextual AI’s narrative is the concept of grounding as a new standard for enterprise language models. Groundedness refers to the mechanism by which AI responses are tethered to information explicitly provided within the context or to reliable, verifiable sources. In highly regulated domains, the AI’s ability to provide precise, context-consistent answers—or to transparently admit uncertainty—matters more than the breadth of possible outputs. Kiela offered a practical illustration: when asked to work from a recipe or a formula, a standard language model might propagate a general truth claim without sufficient caveat. In contrast, Contextual AI’s model would explicitly note that a statement is true only for most cases, not universally, thereby capturing an additional layer of nuance. The ability to say “I don’t know” is presented as a powerful feature in enterprise settings because it reduces the risk of confidently incorrect responses. This stance on groundedness aligns with the broader push in enterprise AI to replace vague, overconfident outputs with reliable, auditable results.
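The refusal behavior described above can be sketched in a few lines. This is a minimal, illustrative stand-in, not Contextual AI's actual method: the term-overlap scoring, stopword list, and threshold are all invented for the example, where a production system would use a trained retriever and a grounded generator.

```python
# Hedged sketch of the grounding contract: answer only from the supplied
# context, and say "I don't know" rather than guess. All scoring logic
# below is a toy stand-in for a real retrieval + grounded-generation stack.
STOPWORDS = {"the", "is", "a", "an", "of", "for", "what", "who", "are", "in"}

def _terms(text: str) -> set[str]:
    """Lowercased content words, with punctuation and stopwords stripped."""
    return {w.lower().strip("?.,") for w in text.split()} - STOPWORDS

def grounded_answer(question: str, context_passages: list[str]) -> str:
    """Return the best-supported passage, or an explicit refusal."""
    q = _terms(question)
    best, best_score = None, 0
    for passage in context_passages:
        score = len(q & _terms(passage))
        if score > best_score:
            best, best_score = passage, score
    # Grounding rule: no sufficiently relevant evidence -> admit uncertainty
    # instead of producing a confident but unsupported answer.
    if best is None or best_score < 2:
        return "I don't know based on the provided context."
    return best

passages = [
    "The refund window for enterprise contracts is 30 days.",
    "Support tickets are triaged within four business hours.",
]
print(grounded_answer("What is the refund window for enterprise contracts?", passages))
print(grounded_answer("Who is the CEO of the vendor?", passages))
```

The second call illustrates the point Kiela makes: an out-of-context question yields an explicit refusal rather than a confidently wrong answer.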

Within this grounded framework, Contextual AI has introduced an architectural approach branded as “RAG 2.0.” This approach aims to move beyond assembling off-the-shelf components in a piecemeal fashion. The company has described a traditional RAG stack as consisting of a frozen embedding model, a vector database for retrieval, and a black-box language model for generation, all connected through prompting or orchestration layers. From Contextual AI’s perspective, this configuration often resembles a Frankenstein’s monster—a set of parts that individually function but do not operate in optimal coordination, resulting in inefficiencies and potential reliability gaps. The RAG 2.0 paradigm represents a deliberate shift toward end-to-end optimization across the system. It emphasizes the joint tuning and integration of all elements so that retrieval, ranking, grounding, and generation work in a harmonized cycle.
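The "Frankenstein" stack the company critiques can be made concrete with a schematic. The components below (a hash-based embedder, an in-memory store, a stubbed generator) are invented stand-ins for illustration only; the point is structural: each part works in isolation, and they are glued together only through a prompt string, with no joint optimization.

```python
# Schematic of the traditional RAG stack described in the article:
# frozen embedder + vector store + black-box generator, connected by prompting.
# Every component here is a toy stand-in, not any vendor's real API.
from dataclasses import dataclass, field

@dataclass
class FrozenEmbedder:
    """Embeds text; its weights are never tuned for the downstream task."""
    def embed(self, text: str) -> float:
        # Toy deterministic "embedding" in place of a real model.
        return (sum(ord(c) for c in text) % 97) / 97.0

@dataclass
class VectorStore:
    embedder: FrozenEmbedder
    docs: list[str] = field(default_factory=list)

    def add(self, doc: str) -> None:
        self.docs.append(doc)

    def search(self, query: str, k: int = 1) -> list[str]:
        q = self.embedder.embed(query)
        return sorted(self.docs, key=lambda d: abs(self.embedder.embed(d) - q))[:k]

def black_box_generate(prompt: str) -> str:
    """Stand-in for an opaque hosted LLM call."""
    return f"[generated from prompt of {len(prompt)} chars]"

# The glue layer: retrieval and generation never co-adapt, which is the
# piecemeal coordination RAG 2.0 is pitched against.
store = VectorStore(FrozenEmbedder())
store.add("Policy A: refunds within 30 days.")
store.add("Policy B: tickets triaged in 4 hours.")
context = store.search("refund policy")
print(black_box_generate(f"Context: {context}\nQuestion: refund policy?"))
```

RAG 2.0, as described, would instead tune these stages jointly so retrieval quality is optimized for what the generator actually needs.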

A core component of RAG 2.0 is Contextual AI’s “mixture-of-retrievers” capability. This feature is described as a sophisticated mechanism for intelligent retrieval that first analyzes the question, then strategically plans how to search for relevant information. The process resembles a strategic thinking step: the system plans a retrieval strategy before pulling data, ensuring that retrieval is aligned with the task’s requirements rather than simply fetching whatever is most accessible. This planning stage is followed by the application of the best available re-ranking to prioritize the most relevant information before it feeds into the grounded language model. The emphasis on a high-quality re-ranker is framed as “the best re-ranker in the world” by the company, reflecting the belief that prioritizing information quality at this stage is crucial for producing accurate, trustworthy outputs. The entire pipeline thus operates in a tightly coordinated fashion, with grounding that ensures the final text remains anchored to the most relevant, verified sources.
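The plan → retrieve → re-rank → generate cycle can be sketched end to end. The article does not disclose Contextual AI's actual components, so the planner heuristics, the two retrievers, and the overlap-based re-ranker below are all hypothetical placeholders that only illustrate the control flow.

```python
# Hedged sketch of the mixture-of-retrievers pipeline: plan the retrieval
# strategy first, fetch from multiple sources, re-rank, then ground the
# answer in the top result. All components are invented for illustration.

def plan(query: str) -> list[str]:
    """Decide which retrievers to consult before fetching anything."""
    sources = ["documents"]
    if any(tok in query.lower() for tok in ("how many", "total", "average")):
        sources.append("database")  # structured data likely needed
    return sources

# Each "retriever" is a stub returning canned hits.
RETRIEVERS = {
    "documents": lambda q: ["Doc: churn is defined as cancellations / active accounts."],
    "database": lambda q: ["DB: total cancellations last quarter = 412."],
}

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Order candidates by crude term overlap before generation sees them."""
    q_terms = set(query.lower().split())
    return sorted(candidates, key=lambda c: -len(q_terms & set(c.lower().split())))

def answer(query: str) -> str:
    candidates = [hit for src in plan(query) for hit in RETRIEVERS[src](query)]
    top = rerank(query, candidates)[0]
    return f"Grounded in: {top}"

print(answer("How many total cancellations last quarter?"))
```

Note how the planning step routes a quantitative question to the structured source before the re-ranker ever runs, mirroring the "strategic thinking step" the company describes.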

Beyond the grounding of text outputs, Contextual AI has expanded its platform to support multimodal content. While the GLM focuses primarily on text generation, the company has integrated capabilities to handle charts, diagrams, and structured data from popular data platforms. The platform now reads charts and connects to databases from widely used systems such as BigQuery, Snowflake, Redshift, and Postgres. This multimodal capacity addresses a long-standing enterprise challenge: the most compelling insights often lie at the intersection of unstructured information (like documents and policies) and structured data (like rows in a database or transactional records). Kiela highlighted that the most impactful enterprise problems sit precisely at this junction, where structured and unstructured data converge to inform decision-making. The platform already demonstrates capacity for complex visualizations, including circuit diagrams in the semiconductor sector, illustrating how Contextual AI can translate domain-specific visuals into actionable, grounded analysis.
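The structured/unstructured junction Kiela describes can be sketched with Python's built-in sqlite3 standing in for a warehouse like BigQuery or Snowflake. The schema, query, and policy passage are invented for illustration; the point is that a warehouse fact and a prose passage are merged into one grounded context before generation.

```python
# Sketch of combining structured warehouse data with unstructured text,
# using sqlite3 as a local stand-in for BigQuery/Snowflake/Redshift/Postgres.
# Table, rows, and policy text are all hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 1200.0), ("EMEA", 800.0), ("APAC", 500.0)],
)

# Structured fact pulled from the warehouse...
(total,) = conn.execute(
    "SELECT SUM(revenue) FROM orders WHERE region = 'EMEA'"
).fetchone()

# ...combined with an unstructured policy passage, so the model can reason
# over rows and prose in one grounded context.
policy = "Per the 2024 sales handbook, EMEA discounts above 10% need approval."
grounded_context = f"{policy}\nWarehouse fact: EMEA revenue = {total:.0f}."
print(grounded_context)
```

A text-only model would see only the policy sentence; the grounded context makes the numeric fact available alongside it.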

The broader trajectory for Contextual AI includes a clear roadmap for expanding the scope and reliability of its tools. The company plans to release its specialized re-ranker component shortly after the GLM release, followed by expanded document-understanding capabilities. In addition, the company is exploring experimental features designed to introduce more agentic capabilities, signaling a potential shift toward autonomous, context-aware assistance within enterprise workflows. Contextual AI was founded in 2023 by Douwe Kiela and Amanpreet Singh, both veterans of Meta’s Fundamental AI Research (FAIR) team and Hugging Face. The company has already secured a growing customer base that includes HSBC, Qualcomm, and The Economist, underscoring the interest of large, global enterprises in grounded, highly reliable AI solutions. The leadership positions this technology as a means for organizations to finally realize tangible returns on their AI investments through more dependable, context-aware AI systems.

From a business perspective, Contextual AI frames this development as timely for enterprises seeking ROI from AI deployments. The company’s messaging emphasizes a shift away from purely broad capabilities toward specialized, problem-focused solutions that address real-world enterprise pain points. The grounded GLM, by prioritizing accuracy and reliability, is presented as a more cost-effective path to deploying AI at scale within regulated industries. The CEO framed this as part of a broader trend where organizations under pressure to deliver returns on AI investments should consider solutions designed to reduce risk while improving trust and accountability in automated processes. The perspective is that a “boring” but highly reliable model may outperform flashier alternatives when accuracy, compliance, and auditability are paramount.

In conjunction with these product developments, Contextual AI has signaled a strong emphasis on practical enterprise deployment considerations. The platform’s capacity to read and interpret charts and to connect with enterprise data sources is positioned as a differentiator in the market, enabling more precise reasoning over both textual narratives and data-driven facts. The focus on groundedness also has implications for governance, risk management, and compliance, since a clearly defined boundary around what the AI can and cannot claim helps organizations maintain verifiable accountability for automated outputs. The company’s outreach to potential clients emphasizes the ROI potential of specialized, grounded AI solutions that align with business processes rather than forcing organizations to retrofit existing workflows around generic AI capabilities.

Contextual AI’s business narrative also highlights the relevance of its approach to highly regulated industries where there is little tolerance for hallucinations. The GLM’s grounding ensures that outputs remain anchored in the live context of a given domain, reducing the risk of misinformed decisions. By integrating retrieval planning, sophisticated re-ranking, and end-to-end grounding, the platform aims to provide enterprise-grade reliability. The multimodal capability extends this reliability to diverse data formats, ensuring that key information embedded in charts, diagrams, and structured datasets informs decision making rather than being ignored or misrepresented by a text-only model. In this sense, Contextual AI is positioning its technology not merely as a better language model but as a more dependable information system, capable of supporting governance, compliance, and operational excellence across a range of industries.

As part of its ongoing market strategy, Contextual AI has publicly referenced real-world deployments with major organizations, noting adoption by HSBC, Qualcomm, and The Economist. The company frames these engagements as evidence that enterprises are increasingly seeking out specialized AI solutions that address their unique problems and demand robust grounding, auditability, and reliability. The emphasis on ROI reflects a practical mindset: organizations want AI investment to translate into measurable improvements in efficiency, risk reduction, and business outcomes. The leadership’s emphasis on ROI is designed to reassure potential customers that grounded GLM solutions are not an abstract experiment but a mature platform capable of integrating into existing enterprise ecosystems, respecting compliance constraints, and delivering consistent performance over time.

In summary, Contextual AI’s GLM represents a concerted effort to redefine enterprise AI through groundedness, integrated architecture, and data-aware capability. By coupling retrieval-augmented strategies with a robust re-ranking framework and end-to-end optimization, the company seeks to minimize hallucinations and enhance reliability in high-stakes settings. The addition of multimodal support expands the utility of the platform beyond text to a richer, data-driven decision-support tool. With a roadmap that includes releasing a specialized re-ranker, expanding document understanding, and exploring agentic features, Contextual AI positions itself as a forward-looking player in the enterprise AI landscape. The founders’ track records, combined with a growing roster of marquee customers, reinforce the company’s narrative that ROI-focused, grounded AI is not only possible but ready for broader adoption. As enterprises continue to grapple with the trade-offs between capability, reliability, and governance, Contextual AI’s approach provides a compelling blueprint for how to achieve practical, trustworthy AI at scale.

Conclusion

Contextual AI’s announced GLM represents a deliberate shift toward enterprise-grade AI that prioritizes factual accuracy, grounded reasoning, and seamless integration with structured data and visualization formats. By emphasizing groundedness—the capability to anchor responses in provided context and to acknowledge uncertainty—Contextual AI aims to reduce hallucinations that have hindered enterprise adoption. The RAG 2.0 architecture described by the company offers a holistic approach that aligns retrieval strategy, re-ranking, and grounding with generation, moving away from piecemeal assembly toward a tightly coordinated system. The multimodal enhancements broaden the platform’s applicability to real-world business problems where charts, diagrams, and databases carry critical information that must inform AI outputs. The roadmap signals ongoing evolution, including the release of a dedicated re-ranker, more extensive document understanding, and early experiments with agentic capabilities—all designed to bolster reliability, auditability, and ROI for enterprise customers. With a growing list of high-profile clients and a leadership team with deep roots in foundational AI research and open-source ecosystems, Contextual AI stands as a noteworthy contender in the ongoing effort to deliver practical, trustworthy AI solutions that meet the stringent demands of enterprise environments.
