Nearly 1,000 questions across diverse datasets used to
measure answer reliability and response time
BOSTON, July 18, 2024 /PRNewswire/ -- Just months after
demonstrating superior answer quality over OpenAI,
Google, Amazon, Cohere, and others, CustomGPT.ai again excelled in
a RAG benchmark analysis comparing its generative AI platform
with OpenAI's Assistants API v2. In
testing involving 945 questions across nine diverse datasets,
CustomGPT.ai outperformed OpenAI by achieving a 10 percent lower
hallucination rate, 13 percent higher accuracy rate, and 34 percent
faster average response time.
"In today's AI race, companies must adopt an 'anti-hallucination
first' focus," said CustomGPT.ai founder and CEO Alden Do Rosario.
"We founded our company on this premise, and we're thrilled new
research further validates our technology, especially for the
6,000-plus customers we now serve."
As organizations bring AI into their operations, they take on
responsibility for the information it generates. "To reduce risk,
entities should adequately vet foundational AI technology and use
solutions that are proven," added Do Rosario.
As skeptics in both B2C and B2B question AI's reliability,
precision, and performance, Do Rosario believes these findings will
especially resonate in industries where accuracy is paramount, such
as the legal sector, finance, healthcare, and education.
Hallucinations -- instances where AI generates information not
grounded in reality or provided context -- can contribute to
misinformed decision-making, compliance issues, safety risks, and
erosion of trust in AI. The research highlights differences in
context and reliability: CustomGPT.ai generated more comprehensive
and accurate responses than OpenAI, whose answers often lacked
detail or missed the mark entirely.
Validated by Tonic.ai, a pioneer in data mimicking and
de-identification, the research supports the use of
Retrieval-Augmented Generation (RAG) to help mitigate AI
hallucinations and support delivery of more precise and reliable
information.
This benchmark went well beyond recent research, using 945
questions rather than 55 and testing against nine datasets,
spanning topics from public health to literature, rather than a
single dataset. An 'answer consistency binary' metric was also
used, whereby any deviation from the expected answer resulted in a
failed response.
Do Rosario said this research significantly ups the ante for
statistical significance, data diversity, and scoring rigor.
"Gone are the days of organizations needing to settle for
chatbots that generate inaccurate responses, especially from
short-sighted, underperforming, or overpriced AI vendors," he
stated. "The future is wide open for gen AI to responsibly deliver
comprehensive and contextually accurate information in order to
truly help organizations advance decision-making
capabilities, improve operational efficiency and increase
revenues."
About CustomGPT.ai
CustomGPT.ai offers a novel, business-grade, privacy-first,
no-code/low-code generative AI platform. The technology makes it
quick, easy, and affordable for anyone — regardless of technical
expertise — to ingest their own content and data, to build custom
bots and other GPT agents, and to deploy these solutions with
confidence. CustomGPT.ai leverages advanced large language models
(including OpenAI's GPT-4) to offer the industry's best accuracy
and anti-hallucination protection. Nearly 6,000 entities rely on
CustomGPT.ai to deliver SOC2-compliant solutions that improve
operational efficiency, enhance customer engagement, and increase
sales – including Adobe, the Massachusetts
Institute of Technology, the Dominican Republic's GPTLegal, and the UK's
DivorceOnline. REST APIs and SDKs are available for
developers, ISVs, digital agencies, and resellers. Visit
https://customgpt.ai or contact hello@customgpt.ai.
Contact:
Beth Strohbusch
beth@customgpt.ai
(414) 213-8818
View original content: https://www.prnewswire.com/news-releases/anti-hallucination-benchmark-confirms-customgptais-continued-leadership-in-generative-ai-answer-accuracy-302200993.html
SOURCE CustomGPT.ai