Enhancing ChatGPT and Gemini Accuracy: Avoiding Hallucination

1. Hallucination in AI chatbots

In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation or delusion) is a response generated by an AI that contains false, inaccurate or misleading information presented as fact [Reference]. In 2023, analysts estimated that chatbots built on large language models (LLMs), such as ChatGPT and Gemini, hallucinate as much as 27% of the time, with factual errors present in 46% of their responses.

How do you ensure that your interactions with ChatGPT and other AI chatbots are accurate? How do you trust the output they produce, especially when you are relying on an AI chatbot to make critical decisions in your personal life and professional work?

This blog describes practical steps you can take to ensure the accuracy of your AI chatbot's responses.

2. Add a Fact Check Toolkit

This technique is quite helpful when using an AI chatbot for generic queries. When asking an AI chatbot to browse the internet or provide general reasoning, ask it to add a fact check toolkit by including the prompt below.

Whenever you generate output, do the following to enhance the factual grounding of your responses:

1. Analyze your output for statements that could be misconstrued if not entirely accurate.
2. Differentiate between:
* Objective facts (verifiable in external sources)
* Subjective interpretations or analyses (based on your reasoning)
3. For objective facts, create a “Fact Check Toolkit” at the end:
* Clearly state the fact
* Provide its source URL or suggest avenues for verification
4. For subjective parts, label them as “Analysis/Interpretation” to distinguish them from pure facts.
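If you interact with an LLM through code rather than the chat UI, the same idea applies: attach the fact-check instructions to every request. Below is a minimal sketch of that in Python. The `system`/`user` message format is the common chat convention used by most LLM APIs, but it is an assumption here, not any specific vendor's API; `FACT_CHECK_PROMPT` is an abridged copy of the prompt above.

```python
# Abridged version of the fact-check prompt from the section above.
FACT_CHECK_PROMPT = (
    "Whenever you generate output, analyze it for statements that could be "
    "misconstrued if not entirely accurate. Differentiate objective facts "
    "from subjective interpretations. For objective facts, create a "
    "\"Fact Check Toolkit\" at the end: state each fact and provide its "
    "source URL or suggest avenues for verification. Label subjective parts "
    "as \"Analysis/Interpretation\"."
)

def build_messages(question: str) -> list[dict]:
    """Prepend the fact-check instructions as a system message,
    so every request carries them automatically."""
    return [
        {"role": "system", "content": FACT_CHECK_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("Is banana healthy for dinner?")
```

The resulting `messages` list can then be passed to whichever chat-completion API you use, so you never have to remember to paste the toolkit prompt by hand.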

Attached is a sample run from ChatGPT. We asked it, “Is banana healthy for dinner?” As you can see, ChatGPT has created a Fact Check Toolkit with URLs you can verify. If you see any shoddy URLs, it is good to discard the response and start a fresh chat.

If you are using Gemini, you can do this easily by clicking the ‘Double-check response’ button at the bottom of each chat. Gemini will colour-code and highlight the response based on the criteria below.

3. Ask for References

This technique is quite useful when you are using AI chatbots to analyze large documents and files (PDF, Word, Excel, etc.). Just add this prompt before asking the AI chatbot to analyze a document.

Summarize the key points from the attached artefact. After each sentence in your summary:
* Cite the IDs (or provide quotations) of the information that support it.
* Cite the page number where the key quotations can be referenced.
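Citations are easier to spot-check when each part of the document carries an explicit ID the model can point to. A hedged sketch of that preprocessing step: tag each page (or paragraph) with a marker like `[page 3]` before sending the text along with the citation prompt. The two page strings are illustrative stand-ins for real extracted PDF pages, not actual data.

```python
def tag_pages(pages: list[str]) -> str:
    """Prefix each page's text with a citable marker like [page 3],
    giving the model concrete IDs to cite in its summary."""
    return "\n\n".join(
        f"[page {i}] {text.strip()}" for i, text in enumerate(pages, start=1)
    )

CITATION_PROMPT = (
    "Summarize the key points from the attached artefact. "
    "After each sentence in your summary:\n"
    "* Cite the IDs (or provide quotations) of the information that support it.\n"
    "* Cite the page number where the key quotations can be referenced.\n\n"
)

# Illustrative page texts standing in for a real document.
pages = ["Revenue grew 12% year on year.", "Operating costs fell by 3%."]
prompt = CITATION_PROMPT + tag_pages(pages)
```

When the summary comes back citing `[page 2]`, you can jump straight to that page and confirm the claim yourself.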

Attached is a sample run from ChatGPT where we asked it to summarize an annual report for a large multinational bank. As you can see, ChatGPT not only summarized the key points but also provided page references from the document where the information can be found. What you have to do now is randomly check some of these pages to see if the information was actually there and whether it was correctly interpreted by ChatGPT.

4. Logical Validation

One quick hack is to ask your AI chatbot to provide the steps it took to derive its output. This is quite helpful when dealing with reasoning and mathematical problems. There are two popular techniques we can use here: ReAct and Think-Step-by-Step.

ReAct (Reasoning and Acting) prompting is particularly useful when you want the AI chatbot to break down the steps, and the reasoning behind them, that lead to the output. By analyzing the reasoning and steps, you can check whether the approach taken was correct and hence gauge the accuracy of the output. Below is an example from the mathematics domain, but you can leverage the technique for other domains as well.

ReAct Prompt Template for Math Problems
Task: [State the math problem clearly.]

Reasoning and Acting Steps:

Reasoning: [First logical step to break down the problem.]
Acting: [Perform the action based on the first reasoning step. Show the calculation or operation.]
Reasoning: [Next logical step to further break down the problem.]
Acting: [Perform the action based on the second reasoning step. Show the calculation or operation.]
Reasoning: [Continue breaking down the problem into logical steps as needed.]
Acting: [Perform the corresponding actions for each step. Show all calculations or operations.]
Reasoning: [Final reasoning step to consolidate all the intermediate results.]
Acting: [Perform the final action to arrive at the solution. Show the final calculation or operation.]
Final Calculation:

[Summarize and consolidate all the calculations to provide the final answer.]

Attached is a sample run from ChatGPT using the ReAct prompt. We used a simple math problem: ‘How many prime numbers are there between 1 and 10,000?’ As you can see, ChatGPT has broken the problem into four logical steps with appropriate reasoning behind them. This gives us confidence that ChatGPT has taken the right approach in deriving the output, and hence the chances of accuracy are higher.
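The point of asking for reasoning steps is that each step can be checked independently, and for this particular problem the final answer can be verified outright: a short Sieve of Eratosthenes confirms there are 1,229 primes between 1 and 10,000, so any chatbot answer can be checked against it.

```python
def count_primes(limit: int) -> int:
    """Count the primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Mark every multiple of n (starting at n*n) as composite.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return sum(is_prime)

print(count_primes(10_000))  # → 1229
```

Whenever the chatbot's problem happens to be mechanically checkable like this, a few lines of your own code are the strongest validation available.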

ReAct prompt example with ChatGPT

There is an alternative, easier technique to achieve the same outcome: just add the line ‘Think step by step’ at the end of the prompt. The run below is from ChatGPT with the same mathematical problem as in the ReAct example above.

Prompt: How many prime numbers are there between 1 and 10,000? Think Step by Step.

Think Step by Step prompt example with ChatGPT

5. Domain Expertise

As a best practice, leverage AI chatbots primarily in your area of domain expertise if you are relying on the output to make critical decisions.
For example, if you are a coder with expertise in Python, you can ask the AI chatbot for inputs in Python (coding, debugging, etc.). As it is your area of expertise, you can always sense if the output possibly has any inaccuracy and take further steps to revalidate. However, if you are using an AI chatbot for a language in which you have no coding expertise at all, say Scala or COBOL, you should be doubly careful about leveraging the output for any execution.
The same applies to other domains. Whether you are a carpenter, physics teacher or dentist, try leveraging the AI chatbot in your domain of expertise.

Also, when you start leveraging an AI chatbot, be very specific in the prompt (question) you provide. The more specific the input, the higher the chances of an accurate output. For example, suppose you are a financial advisor and you want the AI chatbot to analyze the profit and loss statement from a company's 500-page annual report PDF. Instead of asking the AI chatbot to read the whole document, tell it specifically to read only pages X to Y (which contain the financial numbers) and analyze the profit and loss statements. Refer to my other blogs on prompt engineering to learn how to write precise and specific prompts for a better experience with AI chatbots.

6. Delete erroneous chats

Lastly, if you find or suspect that your AI chatbot has provided erroneous output that is not factually correct, it is always best practice to delete or close the current chat session and start a fresh one. This is because some of the inputs become part of the chatbot's memory, and if you continue in the same chat, there is a chance that subsequent responses also get contaminated. This is especially true when using an AI chatbot to analyze a large set of documents and artefacts. A new chat session will enforce a fresh memory from scratch.

Disclaimer!

LLMs like ChatGPT and Gemini can provide incorrect and inaccurate outputs. Always double-check the output before you use it.

Have a question?

If you have any other queries, feel free to drop a comment.

Learn More!

Experiment directly with ChatGPT and Google Gemini.
Want to learn more about effective prompts to get the best out of GenAI and LLMs?

