Between Dreams and Reality: Generative Text and Hallucinations

This is an in-depth dive into hallucinations in LLMs. See the illusions cast by modern AI generative models like ChatGPT, Bard and Claude.

By Josep Ferrer, KDnuggets AI Content Specialist on November 7, 2023 in Language Models

Image generated by DALL-E

In the digital age, the marvels of artificial intelligence have transformed the way we interact, work, and even think.

From voice assistants that curate our playlists to predictive algorithms that forecast market trends, AI has seamlessly integrated into our daily lives.

But as with any technological advancement, it’s not without its twists.

A large language model, or LLM, is a trained machine learning model that generates text based on the prompt you provide. To generate good responses, the model draws on the knowledge it retained during its training phase.

Recently, LLMs have shown impressive and growing capabilities, including generating convincing responses to almost any type of user prompt.

However, even though LLMs have an incredible ability to generate text, it is hard to tell whether what they generate is accurate.

When the generated text reads well but turns out to be inaccurate, we have what is commonly known as a hallucination.

But what are these hallucinations, and how do they impact the reliability and utility of AI?

The Enigma of LLM Hallucinations

LLMs are masterminds when it comes to text generation, translations, creative content, and more.

Despite being potent tools, LLMs do present some significant shortcomings:

The decoding techniques employed can yield outputs that are uninspiring, lack coherence, or fall into monotonous repetition.
Their knowledge foundation is “static” in nature, presenting challenges in seamless updates.
A common issue is the generation of text that is either nonsensical or inaccurate.

The last point is referred to as hallucination, a concept borrowed from human experience and extended to AI.

For humans, hallucinations represent experiences perceived as real despite being imaginary. This concept extends to AI models, where the hallucinated text appears accurate even though it's false.

In the context of LLMs, “hallucination” refers to a phenomenon where the model generates text that is incorrect, nonsensical, or not real.

Image by DALL-E

LLMs are not designed like databases or search engines, so they don’t reference specific sources or knowledge in their answers.

I bet most of you are wondering… How is that possible?

Well… these models produce text by building upon the given prompt. The generated response isn’t always directly backed by specific training data, but is crafted to align with the context of the prompt.

In simpler terms:

They can confidently spew out information that’s factually incorrect or simply doesn’t make sense.
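To make this concrete, here is a minimal sketch using a toy bigram model. It is nowhere near a real LLM, and the tiny corpus and the `continue_prompt` helper are hypothetical, made up for illustration only. Both training sentences are true, yet because the model only learns which word tends to follow which, it can blend them into a fluent but false statement.

```python
import random
from collections import defaultdict

# Toy "training data" (hypothetical): both sentences are true.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
]

# Record which words were seen following each word (a bigram model).
next_words = defaultdict(list)
for sentence in corpus:
    tokens = sentence.split()
    for current, following in zip(tokens, tokens[1:]):
        next_words[current].append(following)

def continue_prompt(prompt, rng, max_new_tokens=5):
    """Extend the prompt one word at a time, looking only at the previous word."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        candidates = next_words.get(tokens[-1])
        if not candidates:  # no known continuation: stop generating
            break
        tokens.append(rng.choice(candidates))
    return " ".join(tokens)

# Sample many completions of the same prompt and collect the distinct ones.
rng = random.Random(0)
completions = {continue_prompt("paris is the capital of", rng) for _ in range(50)}
print(completions)
```

After the word "of", the model has seen both "france" and "italy", so "paris is the capital of italy" is a perfectly valid continuation from its point of view. Nothing in the generation step checks the sentence against a source.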

Deciphering the Types of Hallucinations

Identifying hallucinations has always posed a significant challenge, even in humans. With LLMs, the task becomes even more complex because we rarely have a reliable baseline to compare the output against.

While detailed insights such as the model's output probability distributions can aid in this process, such data is not always available, adding another layer of complexity.
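When those probabilities are available, one rough way to use them is to turn them into a confidence signal. The sketch below assumes you already have the probability the model assigned to each generated token, and all the numbers are hypothetical. Keep in mind that a low score only marks text worth double-checking; it does not prove the text is false, and a confident answer can still be wrong.

```python
import math

def mean_log_prob(token_probs):
    """Average log-probability of the generated tokens (closer to 0 = more confident)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def entropy(distribution):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in distribution if p > 0)

# Hypothetical probabilities assigned to each token of two generated answers.
confident_answer = [0.95, 0.90, 0.97, 0.92]
uncertain_answer = [0.95, 0.20, 0.15, 0.60]

print("confident:", mean_log_prob(confident_answer))
print("uncertain:", mean_log_prob(uncertain_answer))

# Hypothetical next-token distributions over four candidate tokens.
peaked = [0.97, 0.01, 0.01, 0.01]  # the model strongly prefers one token
flat = [0.25, 0.25, 0.25, 0.25]    # the model is effectively guessing

print("peaked entropy:", entropy(peaked))
print("flat entropy:", entropy(flat))
```

The uncertain answer gets a more negative mean log-probability, and the flat distribution has the higher entropy. Either value could be compared against a threshold that you would need to tune for your own model and task.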

The issue of hallucination detection remains unsolved and is a subject of ongoing research. Still, hallucinations tend to take a few recognizable forms:

The Blatant Untruths: LLMs might conjure up events or figures that never existed.
The Overly Accurate: They might overshare, potentially leading to the spread of sensitive information.
The Nonsensical: Sometimes, the output might just be pure gibberish.

Why Do These Hallucinations Occur?

A major cause lies in the training data. LLMs learn from vast datasets, which can be incomplete, outdated, or even contradictory. This ambiguity can lead them astray, making them associate certain words or phrases with inaccurate concepts.

Moreover, the sheer volume of data means that LLMs might not have a clear “source of truth” to verify the information they generate.

Using Hallucinations to Your Advantage

Interestingly, these hallucinations can be a boon in disguise. If you’re seeking creativity, you’d want LLMs like ChatGPT to hallucinate.

Image generated by DALL-E

Imagine asking for a unique fantasy story plot. You’d want a fresh narrative, not a replica of an existing one.

Similarly, when brainstorming, hallucinations can offer a plethora of diverse ideas.
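The article does not name a specific mechanism for dialing this creativity up or down. One commonly used knob, assumed here for illustration, is the sampling temperature: the model's raw scores (logits) are divided by a temperature before being turned into probabilities. The sketch below uses hypothetical logits for four candidate next tokens and shows how a higher temperature spreads probability over more options.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate next tokens.
logits = [4.0, 2.0, 1.0, 0.5]

sharp = softmax_with_temperature(logits, 0.5)   # conservative, repeatable
neutral = softmax_with_temperature(logits, 1.0)
loose = softmax_with_temperature(logits, 2.0)   # more varied, more surprising

for name, probs in (("T=0.5", sharp), ("T=1.0", neutral), ("T=2.0", loose)):
    print(name, [round(p, 3) for p in probs])
```

At a low temperature almost all the probability sits on the top-scoring token, while at a high temperature the less likely tokens get a real chance of being picked. That helps when brainstorming and is risky when you need facts.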

Mitigating the Mirage

Awareness is the first step towards addressing these hallucinations. Here are some strategies to keep them in check:

Consistency Checks: Generate multiple responses to the same prompt and compare.
Semantic Similarity Checks: Use tools like BERTScore to measure the semantic similarity between generated texts.
Training on Updated Data: Regularly update the training data to keep it relevant. You can even fine-tune a model to improve its performance in specific fields.
User Awareness: Educate users about potential hallucinations and the importance of cross-referencing information.
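The first two strategies above can be combined into a rough sketch: sample several responses to the same prompt, score how similar they are to each other, and treat low agreement as a warning sign. To keep the example self-contained, it swaps BERTScore for a much cruder word-overlap ratio from Python's standard library, which compares wording rather than meaning. The sample responses, the `consistency_score` helper, and the threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Word-level overlap ratio in [0, 1]; a crude lexical stand-in for BERTScore."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

def consistency_score(responses):
    """Mean pairwise similarity across two or more responses to the same prompt."""
    pairs = list(combinations(responses, 2))
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical responses sampled several times for the same prompt.
stable = [
    "The Eiffel Tower is in Paris.",
    "The Eiffel Tower is in Paris.",
    "The Eiffel Tower is located in Paris.",
]
unstable = [
    "The bridge opened in 1932.",
    "It was completed in 1871 by a French firm.",
    "Construction finished around 1990.",
]

THRESHOLD = 0.6  # hypothetical cut-off; tune it for your own application

for name, responses in (("stable", stable), ("unstable", unstable)):
    score = consistency_score(responses)
    verdict = "consistent" if score >= THRESHOLD else "possible hallucination"
    print(name, round(score, 3), verdict)
```

The idea is that a model that actually "knows" an answer tends to repeat it, while a hallucinating model tends to change its story between samples. Agreement is not proof of correctness, though, since a model can be consistently wrong.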

And last but not least… EXPLORE!

This article has laid the groundwork regarding LLM hallucinations, yet the implications for you and your application might diverge considerably.

Moreover, your understanding of these phenomena may not match how they actually show up in practice. The key to fully grasping the impact of LLM hallucinations on your own work is to explore LLMs in depth.

In Conclusion

The journey of AI, especially LLMs, is akin to sailing in uncharted waters. While the vast ocean of possibilities is exciting, it’s essential to be wary of the mirages that might lead us astray.

By understanding the nature of these hallucinations and implementing strategies to mitigate them, we can continue to harness the transformative power of AI, ensuring its accuracy and reliability in our ever-evolving digital landscape.

Josep Ferrer is an analytics engineer from Barcelona. He graduated in physics engineering and is currently working in the data science field applied to human mobility. He is a part-time content creator focused on data science and technology. Josep writes on all things AI, covering the application of the ongoing explosion in the field.