Hallucination
What Is It?
Hallucination occurs when an AI model produces output that sounds plausible and confident but is factually incorrect, made up, or unsupported by any real source. Think of it like a student who doesn't know the answer but writes a convincing-sounding essay anyway — the grammar is perfect, the tone is authoritative, and the facts are wrong. Hallucinations range from subtle errors (a slightly wrong date, a misattributed quote) to wholesale fabrications (citing a study that doesn't exist, inventing a legal precedent). The problem is structural: LLMs generate text by predicting likely next tokens, not by retrieving verified facts — so confident delivery and factual accuracy are not the same thing.
Why It Matters
Hallucination risk is one of the most consequential factors in model selection for any task where accuracy matters. If you are using an AI to draft legal documents, summarize medical research, generate code, or answer customer questions, a model that hallucinates 5% of the time is not 95% useful — it is potentially dangerous, because you cannot easily tell which 5% is wrong. Two of our internal benchmark dimensions bear directly on this: faithfulness (does the model stay grounded in provided source material?) and safety calibration (does the model acknowledge uncertainty rather than confabulate?). In our testing across 52 models, faithfulness scores cluster high — with a median of 5/5 and a 25th percentile of 4/5 — but safety calibration tells a starker story: the median is just 2/5, meaning most models will assert false confidence rather than say "I don't know." That gap is the hallucination risk in practice. For developers building RAG pipelines or document QA systems, faithfulness scores are a direct proxy for how reliably a model will stick to what it was given. For consumers, safety calibration predicts whether a model will hedge appropriately or present fiction as fact.
How It Applies
On ModelPicker, hallucination risk maps to two benchmark dimensions you will see on every model profile. Faithfulness (scored 1–5) tests whether a model accurately represents provided source material without inventing details. Safety calibration (scored 1–5) tests whether a model expresses appropriate uncertainty when it does not know something, rather than generating a confident but wrong answer. Across the 52 models we track, faithfulness has a 25th-percentile score of 4/5 — most models do reasonably well when grounded in source text. Safety calibration is the weak point: the 25th percentile is 1/5 and the median is just 2/5, meaning the majority of models we test lean toward false confidence over honest uncertainty. When you filter or compare models on ModelPicker, sorting by safety calibration score is the most direct way to surface models that are less likely to hallucinate in open-ended, fact-dependent tasks. Pairing that with a high faithfulness score gives you the strongest signal for low-hallucination deployments.
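The filter-and-sort advice above can be sketched in a few lines. This is a minimal illustration, not ModelPicker's actual API: the model names and scores below are hypothetical, and the ranking rule (safety calibration first, then faithfulness) simply mirrors the guidance in this section.

```python
# Hypothetical model profiles carrying the two dimensions discussed
# above. All names and scores are illustrative, not real benchmark data.
models = [
    {"name": "model-a", "faithfulness": 5, "safety_calibration": 2},
    {"name": "model-b", "faithfulness": 4, "safety_calibration": 4},
    {"name": "model-c", "faithfulness": 5, "safety_calibration": 1},
    {"name": "model-d", "faithfulness": 3, "safety_calibration": 5},
]

def low_hallucination_rank(profile):
    # Sort by safety calibration first (the weak point across models),
    # breaking ties with faithfulness, per the advice above.
    return (profile["safety_calibration"], profile["faithfulness"])

ranked = sorted(models, key=low_hallucination_rank, reverse=True)

# Shortlist models that score well on BOTH dimensions — the strongest
# signal for low-hallucination deployments.
shortlist = [
    m["name"]
    for m in ranked
    if m["safety_calibration"] >= 4 and m["faithfulness"] >= 4
]
```

With these illustrative scores, `model-d` ranks first on safety calibration alone, but only `model-b` clears the combined threshold — which is exactly the point: a single high score on one dimension is not enough.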