
Epistemic uncertainty is uncertainty that arises from a lack of knowledge; in machine learning, for instance, it can be caused by a lack of training data. Estimating epistemic uncertainty is important for useful AI systems, since it allows the model to "know that it doesn't know" and therefore avoid hallucinations.

While epistemic uncertainty has a clear interpretation for machine learning classifiers, it is less clear how to evaluate it for generative models tasked with text generation, since many different completions can be considered satisfactory. Still, a good epistemic uncertainty estimator should clearly return a high value when a modest AI model is asked, for example, to "Solve the Riemann hypothesis" (a hard, unsolved math problem).

What are the leading methods to estimate Epistemic Uncertainty in Large Language Models?

Rexcirus

1 Answer


In the traditional Monte Carlo dropout approach, you keep dropout enabled at inference time and perform multiple stochastic forward passes, sampling different outputs for the same input. The variability across these samples can serve as a proxy for epistemic uncertainty.
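Below is a minimal sketch of this idea, assuming a PyTorch classifier that contains `nn.Dropout` layers; the `model` and `x` names are placeholders, and only the dropout modules are switched back to train mode so that layers such as batch norm keep their inference behavior.

    import torch

    def mc_dropout_predict(model, x, n_samples=20):
        """Monte Carlo dropout: keep dropout active at inference time and
        aggregate several stochastic forward passes.

        `model` and `x` are assumptions: any torch.nn.Module containing
        nn.Dropout layers, and a batch of inputs it accepts.
        """
        model.eval()
        # Re-enable dropout layers only, leaving e.g. batch norm in eval mode.
        for module in model.modules():
            if isinstance(module, torch.nn.Dropout):
                module.train()

        with torch.no_grad():
            probs = torch.stack(
                [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
            )  # shape: (n_samples, batch, n_classes)

        mean_probs = probs.mean(dim=0)
        # Variance across the stochastic passes as a simple epistemic-uncertainty proxy.
        uncertainty = probs.var(dim=0).sum(dim=-1)
        return mean_probs, uncertainty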

For LLMs, recent work on chain-of-thought (CoT) prompting has suggested that if you sample multiple reasoning paths for the same input and then aggregate them or measure the variance among them, their diversity can serve as an uncertainty signal. When many diverse CoT completions emerge, the model is likely less certain about the correct reasoning process. This is why self-consistency across sampled CoT completions is taken as an indicator of lower epistemic uncertainty. You can read further in Wang et al. (2023), "Self-Consistency Improves Chain-of-Thought Reasoning in Language Models".

This suggests that one can use self-consistency to provide an uncertainty estimate for the model's generated solutions. In other words, low consistency can be taken as an indicator that the model has low confidence; i.e., self-consistency confers some ability for the model to "know when it doesn't know".
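A minimal sketch of such a self-consistency estimate, assuming you have already sampled several chain-of-thought completions at nonzero temperature and parsed a final answer from each; the `self_consistency_uncertainty` helper and the example answers are illustrative only.

    from collections import Counter
    import math

    def self_consistency_uncertainty(sample_answers):
        """Given the final answers parsed from N sampled CoT completions,
        return the majority answer plus two uncertainty signals:
        the disagreement rate and the normalized answer entropy.
        """
        counts = Counter(sample_answers)
        majority_answer, majority_count = counts.most_common(1)[0]

        # Agreement rate: fraction of samples that land on the majority answer.
        agreement = majority_count / len(sample_answers)

        # Normalized entropy of the answer distribution as an alternative signal:
        # 0 when all samples agree, larger when answers are spread out.
        probs = [c / len(sample_answers) for c in counts.values()]
        entropy = -sum(p * math.log(p) for p in probs)
        max_entropy = math.log(len(sample_answers))
        normalized_entropy = entropy / max_entropy if max_entropy > 0 else 0.0

        return majority_answer, 1.0 - agreement, normalized_entropy

    # Example: final answers from 10 sampled CoT completions of the same question.
    answers = ["42", "42", "41", "42", "42", "43", "42", "42", "42", "41"]
    print(self_consistency_uncertainty(answers))

High disagreement (or high answer entropy) would then be read as high epistemic uncertainty for that query.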

cinch