
Ever asked an AI a question and got a super confident answer, only to find it was totally wrong? It's a bit like that loudmouth in the pub who knows everything, even when they don't. Well, scientists at MIT have found a fix: teaching AI models to finally say 'I'm not sure', making them far more reliable.
Today's most advanced AI reasoning models are often overconfident. They spout every answer with unshakable certainty, whether they're actually right or just guessing.
Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have traced this overconfidence to a flaw in how the models are trained, and they've now fixed it without sacrificing accuracy.
Their new technique, called Reinforcement Learning with Calibration Rewards (RLCR), trains language models to give a confidence score alongside their answers. So, they don't just answer; they also tell you how certain they are.
In tests, RLCR reduced calibration error by a massive 90%.
Crucially, it maintained or even improved accuracy on both familiar and brand-new tasks.
It turns out the issue comes from a surprisingly simple source: current training methods for AI models reward correct answers but heavily penalise wrong ones. There’s nothing in between.
This means a model that guesses correctly gets the same praise as one that carefully reasoned its way to the right answer. Over time, this teaches AI to confidently answer everything, even when it’s effectively flipping a coin.
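To see the flaw in miniature, here is an illustrative sketch (our own, not MIT's code) of the standard all-or-nothing reward that the article describes:

```python
def binary_reward(answer: str, correct_answer: str) -> float:
    """Standard training signal: +1 for a correct answer, -1 for a wrong one.

    There is nothing in between, so a lucky guess scores exactly the same
    as a carefully reasoned answer, and hedging earns nothing.
    """
    return 1.0 if answer == correct_answer else -1.0

# A guess that lands is indistinguishable from genuine knowledge:
print(binary_reward("Paris", "Paris"))        # full reward, however it got there
print(binary_reward("I'm not sure", "Paris")) # honesty is punished like any error
```

Under this scheme, confidently answering everything is always the best strategy, which is exactly the behaviour the researchers set out to change.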
This can be dangerous. Imagine an AI used in medicine or finance that claims 'I'm 95 percent sure' but is only right half the time. Users have no clue they should seek a second opinion.
Mehul Damani, an MIT PhD student and co-lead author, said: "The standard training approach is simple and powerful, but it gives the model no incentive to express uncertainty or say I don’t know. So the model naturally learns to guess when it is unsure."
RLCR tackles this by adding a 'Brier score' to the training process: a penalty on the gap between the model's stated confidence and whether it was actually correct.
Confidently wrong answers are punished, but so are correct answers hedged with unnecessary uncertainty.
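As a rough sketch of the idea (our reading of the technique, not MIT's implementation), the reward can combine the usual correctness term with a Brier penalty, the squared gap between stated confidence and the actual outcome:

```python
def rlcr_reward(is_correct: bool, confidence: float) -> float:
    """Illustrative RLCR-style reward: correctness minus a Brier penalty.

    confidence is the model's self-reported probability of being right,
    in [0, 1]. The squared-error term punishes both overconfidence on
    wrong answers and needless hedging on right ones.
    """
    correctness = 1.0 if is_correct else 0.0
    brier_penalty = (confidence - correctness) ** 2
    return correctness - brier_penalty

print(rlcr_reward(False, 0.95))  # confidently wrong: heavy penalty (-0.9025)
print(rlcr_reward(True, 0.55))   # right but hesitant: reward docked (0.7975)
print(rlcr_reward(True, 0.95))   # right and calibrated: near-full reward (0.9975)
```

Because the Brier score is a proper scoring rule, the model's best strategy is to report its honest probability of being correct rather than to bluff.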
Isha Puri, another MIT PhD student and co-lead author, highlighted the issue: "What’s striking is that ordinary RL training doesn't just fail to help calibration. It actively hurts it. The models become more capable and more overconfident at the same time."
The team found that even the act of reasoning about uncertainty itself holds value, making the AI's self-reflection useful information, not just fluff.
OFFICIAL SOURCE VERIFICATION: This report is based on official data from MIT. Document: Teaching AI models to say “I’m not sure” Source Link: [Read the official report from MIT](https://news.mit.edu/2026/teaching-ai-models-to-say-im-not-sure-0422)
Editorial Note: This report utilises automated data-sourcing and drafting technologies to ensure rapid coverage. Every article undergoes rigorous human fact-checking and editorial review by the Trend Wire Media Editorial Desk to ensure accuracy and adherence to our journalistic standards.