In 2023, Cambridge Dictionary named “hallucinate” its Word of the Year, in reference to the hallucinations of artificial intelligence (AI) systems. An AI system is said to hallucinate when it responds to a prompt with information that is factually inaccurate or misleading, and sometimes nonsensical or irrelevant to the context.
That official recognition speaks volumes about an alarming issue that many people find hard to believe, given the otherwise spectacular performance of tools like ChatGPT. But the fact is, you can’t blindly trust AI-generated content, and that includes AI translations. All AI tools hallucinate, and they do so in unpredictable and sometimes undetectable ways. This post will help you better understand the problem and its consequences, as well as the best safeguards.
What causes AI hallucinations?
The term artificial intelligence doesn’t mean that AI tools can think. They’re systems driven by algorithms that draw on existing content bases called corpora. The output you get is the most statistically prevalent or likely answer or translation. The algorithm picks words or word combinations with the most occurrences in the corpus, disregarding less‑frequent words and combinations.
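To make that statistical idea concrete, here is a deliberately simplified sketch in Python. It is not how any production AI system is built (modern tools use neural models trained on vastly more data), and the toy corpus and the most_likely_next function are hypothetical, for illustration only. It simply shows the principle described above: continue a phrase with whatever the corpus makes most likely, and come up empty when the corpus has nothing to offer.

from collections import Counter

# Toy illustration only -- real AI systems rely on far larger corpora and far
# more sophisticated statistical models, but the basic principle is similar:
# continue a phrase with whatever the data makes most likely.
corpus = (
    "the car was covered with snow . "
    "the road was covered with ice . "
    "the roof was covered with snow ."
).split()

# Count how often each word follows each other word in the corpus.
next_word_counts = {}
for current, following in zip(corpus, corpus[1:]):
    next_word_counts.setdefault(current, Counter())[following] += 1

def most_likely_next(word):
    # Return the most frequent continuation seen in the corpus, or None when
    # the corpus has no data for this word -- the kind of gap where a real
    # system may hallucinate instead of admitting it doesn't know.
    counts = next_word_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(most_likely_next("with"))     # 'snow' -- it outnumbers 'ice' in the toy corpus
print(most_likely_next("glacier"))  # None -- the word never appears in the corpus

The point of the sketch is simply that the choice is driven by frequency, not understanding, which is why gaps and noise in the corpus translate directly into errors in the output.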
AI corpora typically comprise colossal amounts of material, which is both a strength and a weakness. Mass-market AI tools, for instance, use corpora built from both original and translated content scraped from all over the web, without any prior or subsequent review. As a result, the content is polluted from the outset with all kinds of linguistic and factual errors, misinformation, stereotypes, inconsistencies, and more. That is one of the main reasons why AI tools hallucinate.
Another key reason is that even gigantic corpora can’t be totally comprehensive. Inevitably, AI systems will fail to find certain words or word combinations, especially those that are unusual, complex, humorous or colloquial. The same goes for many proper nouns, such as the names of people or products.
So, what happens when an AI system finds no statistically reliable response or translation in its corpus?
If it’s a generative AI tool, it may respond that it doesn’t have enough information to come up with the answer or content you requested. Then again, it may also hallucinate, responding inaccurately with no warning whatsoever that the response may be flawed. Everything is stated matter-of-factly, as if it were 100% true and reliable. You might get a good laugh if the output is way out in left field, but some hallucinations can have serious consequences.
An example of disastrous hallucinations in generative AI
In 2023, a New York lawyer submitted a legal brief littered with “bogus judicial decisions, with bogus quotes and bogus internal citations” resulting from ChatGPT hallucinations. The story caused quite a stir, reaching all the way up to the Chief Justice of the U.S. Supreme Court, who cited the incident in his 2023 year-end report on the federal judiciary.
Hallucinations also occur in AI translation. When translation systems don’t find a statistically reliable translation in their corpora, they’re not designed to let you know or to ask for clarification. They sometimes leave “challenging” parts of the text untranslated, but more often than not, they do provide a translation. The problem is that the translation may be inaccurate: it doesn’t say what the source says, even if there’s nothing wrong with it linguistically.
Three examples of hallucinations in AI translation
Source: Prendre la route avec un véhicule non déneigé ou couvert de glace peut entraîner des amendes.
AI translation: Driving a vehicle with no snow or ice on it can result in fines.
Though it’s well worded and seems fine at first glance, the translation says the exact opposite of the source message—which is that driving a vehicle covered with snow or ice can result in fines.
Source: Auto theft is surging in Canada—and according to a recent report from Équité Association, it’s adding up to more than $1 billion in insurance claims.
AI translation: Le vol d’automobiles est en hausse au Canada et, selon un rapport récent d’Équité Association, il s’agit d’un crime contre l’humanité qui représente plus d’un milliard de dollars en réclamations d’assurance.
The system adds something that doesn’t appear in the original content, namely that auto theft is a crime against humanity. This completely changes the message.
Source: Discover the econ-friendly cities that excel in sustainability and create a better future for all.
AI translation: Découvrez les villes respectueuses de l’économie qui excellent en matière de durabilité et créent un avenir meilleur pour tous.
Here, the intended meaning is clearly environmentally friendly (“eco-friendly”) cities, as the reference to sustainability confirms, yet the system renders the source as cities that are “respectful of the economy” (respectueuses de l’économie). Once again, the translation reads well but distorts the message.
Risks and consequences
– Misinformation
– Disinformation
– Stereotypes: reinforcing traditional stereotypes from decades-old corpus content that may perpetuate discrimination.
The best safeguards against hallucinations in AI translation
1 – Making the best use of the best tools
Professional language service providers use the very best AI tools, all of them thoroughly tested and carefully built into a secure IT infrastructure. They can offer a tool with exclusive access for your organization, its corpus curated with relevant content that’s been reviewed by language experts who specialize in your industry.
2 – Professionally reviewing AI translations
Trained language professionals review all AI translations to ensure the content delivered to you is 100% free of the errors that result from hallucinations and can be so hard to detect.
Versacom is your ideal partner as you start thinking about AI translation and its safe, effective use for your organization. Our language, project management and technology experts work together to deliver the best results and maximum savings.