Understanding Epistemic Uncertainty in Deep Learning

Prakash Verma
Published in Heartbeat · 7 min read · Apr 23, 2023


Introduction

Deep learning has been widely used in various fields, such as computer vision, NLP, and robotics. Its success is largely due to the ability of deep neural networks to learn complex representations from data. Despite this success, however, deep learning still faces several challenges, including epistemic uncertainty.

In this article, we will provide an overview of epistemic uncertainty in deep learning.

What is Epistemic Uncertainty?

Epistemic uncertainty is a type of uncertainty arising from a lack of knowledge about the model’s parameters and architecture. It is a form of uncertainty that can be reduced or eliminated with additional data, improved models, and a better understanding of the underlying system.

Epistemic uncertainty is often contrasted with aleatoric uncertainty, which arises from the inherent variability of the data itself, such as measurement noise or natural variability. For example, a classifier trained on only a handful of cat images has high epistemic uncertainty about cats, which more data would reduce, whereas blur in the images themselves is aleatoric and irreducible. Understanding and managing epistemic uncertainty is essential to building reliable and robust deep learning models.


Sources of Epistemic Uncertainty

Understanding the sources of epistemic uncertainty is important for developing more robust and reliable machine learning systems. By identifying the sources of uncertainty, we can develop techniques to mitigate or reduce the uncertainty, leading to more accurate and trustworthy predictions.

Epistemic uncertainty in deep learning arises from various sources. Here are some of the most common sources of epistemic uncertainty:

  1. Limited training data: Epistemic uncertainty arises when the neural network is not certain about the mapping between the input and output data. When the training data is limited, the neural network may not have enough examples to learn the underlying distribution of the data, leading to epistemic uncertainty.
  2. Ambiguity in the input data: Epistemic uncertainty can arise when the input data is ambiguous or noisy. For example, in a medical diagnosis task, the neural network may be uncertain about the diagnosis if the input image is of poor quality or the patient’s symptoms are not clear.
  3. Model architecture: The choice of model architecture can also contribute to epistemic uncertainty. A model that is too simple may not capture the complexity of the data, while a model that is too complex may overfit the training data and not generalize well to new data.
  4. Model initialization: The choice of initial values for the model parameters can also contribute to epistemic uncertainty. Different initializations can lead to different solutions, and the neural network may be uncertain about which solution is correct.
  5. Stochasticity in the learning process: The use of stochastic optimization algorithms such as SGD can introduce stochasticity in the learning process, leading to epistemic uncertainty. Different runs of the optimization algorithm with the same initial values can lead to different solutions, and the neural network may be uncertain about which solution is correct.

How to Estimate Epistemic Uncertainty

Epistemic uncertainty, also known as model uncertainty, arises from the fact that the neural network is not certain about the mapping between the input and output data. There are several methods that can be used to estimate epistemic uncertainty in deep learning, and the choice of method will depend on the specific application and the available data.

1. Bayesian deep learning

Bayesian deep learning is a powerful approach for estimating epistemic uncertainty in deep learning. Here are the general steps for using Bayesian deep learning to estimate epistemic uncertainty:

  1. Define a prior distribution over the model parameters. This prior distribution represents our prior beliefs about the model parameters before seeing the data.
  2. Train the neural network using Bayesian inference. Instead of updating a single point estimate of the model parameters, we update the posterior distribution over them using Bayes’ rule. In practice the exact posterior is intractable for deep networks, so it is approximated with techniques such as variational inference or Markov chain Monte Carlo.
  3. Use the posterior distribution over the model parameters to make predictions on new data. We can sample from the posterior distribution to generate a distribution of predictions, which represents the epistemic uncertainty.
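
Here is a minimal sketch of these steps, using a hand-rolled mean-field variational (“Bayes by Backprop”-style) linear layer in PyTorch; the layer sizes, prior scale, initial values, and number of posterior samples are illustrative assumptions rather than a prescribed recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a factorized Gaussian posterior over weights and biases."""

    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        # Variational parameters: mean and pre-softplus std of the posterior.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))
        self.prior_std = prior_std

    def forward(self, x):
        # Reparameterization trick: sample a fresh set of weights per forward pass.
        w = self.w_mu + F.softplus(self.w_rho) * torch.randn_like(self.w_mu)
        b = self.b_mu + F.softplus(self.b_rho) * torch.randn_like(self.b_mu)
        return F.linear(x, w, b)

    def kl(self):
        # KL(q || p) between the Gaussian posterior and a zero-mean Gaussian prior;
        # added to the data loss during training (the ELBO objective).
        def term(mu, rho):
            std = F.softplus(rho)
            return (torch.log(self.prior_std / std)
                    + (std ** 2 + mu ** 2) / (2 * self.prior_std ** 2) - 0.5).sum()
        return term(self.w_mu, self.w_rho) + term(self.b_mu, self.b_rho)

# After training, each forward pass draws a new weight sample, so repeated
# passes yield a distribution of predictions (the epistemic uncertainty).
layer = BayesianLinear(10, 1)
x = torch.randn(5, 10)
preds = torch.stack([layer(x) for _ in range(100)])   # shape: (100, 5, 1)
print(preds.mean(dim=0), preds.std(dim=0))            # predictive mean and spread
```

Training would minimize the usual data loss plus `layer.kl()` (the evidence lower bound); that loop is omitted here for brevity.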


2. Dropout

Dropout is a regularization technique that can be used to estimate epistemic uncertainty. By using dropout during inference, we can generate an ensemble of predictions, each with a slightly different network architecture due to the dropout. The variance in the predictions can be used to estimate the epistemic uncertainty.


Dropout is an efficient method that is frequently used to address overfitting in deep neural networks. To prevent units from co-adapting too strongly during training, dropout randomly drops each unit with a given probability on every forward pass through the network.

Dropout layers are typically deactivated after training so that they do not interfere with the forward pass on new data points. To estimate uncertainty, however, dropout is left active at inference time (Monte Carlo dropout): you run the prediction several times and examine the spread of the outputs produced by the different stochastic forward passes.
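
Here is a minimal Monte Carlo dropout sketch along these lines in PyTorch; the network shape, dropout rate, and number of forward passes are illustrative assumptions. The key step is switching only the dropout modules back to training mode at inference so they keep sampling masks:

```python
import torch
import torch.nn as nn

# A small classifier with dropout; sizes and rate are placeholders.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_passes=50):
    """Average several stochastic forward passes with dropout left active."""
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()  # keep sampling dropout masks at inference time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_passes)])
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(8, 20)
mean_probs, spread = mc_dropout_predict(model, x)
# A large spread across passes signals high epistemic uncertainty for that input.
```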

3. Deep ensembles

Deep ensembles are another approach to estimating epistemic uncertainty in deep learning. By using deep ensembles, we can estimate the epistemic uncertainty associated with the model parameters, which can help to develop more robust and reliable machine learning systems. Here are the general steps for using deep ensembles to estimate epistemic uncertainty:

  1. Define multiple neural networks with different architectures or initializations. The different neural networks are trained independently on the same dataset.
  2. Use the trained neural networks to make predictions on new data. Each neural network produces its own set of predictions.
  3. Combine the predictions from the multiple neural networks to form a distribution of predictions. For example, we can use the mean or median of the predictions as the point estimate, and the variance or standard deviation of the predictions as a measure of epistemic uncertainty.
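
Here are those steps in a brief PyTorch sketch; the toy sine-regression data, network size, training length, and ensemble of five members are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Toy 1-D regression data standing in for a real dataset.
x = torch.linspace(-3, 3, 100).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

def train_member(seed, epochs=500):
    """Train one ensemble member from its own random initialization."""
    torch.manual_seed(seed)
    net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(net(x), y).backward()
        opt.step()
    return net

ensemble = [train_member(seed) for seed in range(5)]

# Combine member predictions: the mean is the point estimate, and the
# standard deviation across members measures epistemic uncertainty.
x_test = torch.linspace(-5, 5, 50).unsqueeze(1)  # includes out-of-range inputs
with torch.no_grad():
    preds = torch.stack([net(x_test) for net in ensemble])
mean, std = preds.mean(dim=0), preds.std(dim=0)
# std tends to grow outside [-3, 3], where the members disagree most.
```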

Implications of Epistemic Uncertainty

Epistemic uncertainty can have several implications, including:

  1. Decreased model performance: Epistemic uncertainty can lead to decreased model performance because the model may not be able to make accurate predictions on new data. If the neural network is uncertain about the mapping between the input and output data, it may produce less accurate predictions, leading to decreased model performance.
  2. Increased risk: Epistemic uncertainty can increase the risk associated with using machine learning systems. If the model is uncertain about its predictions, it may make incorrect decisions that could have serious consequences. For example, in a medical diagnosis task, if the model is uncertain about the diagnosis, it may recommend the wrong treatment, leading to adverse health outcomes.
  3. Reduced interpretability: Epistemic uncertainty can reduce the interpretability of the model. If the model is uncertain about its predictions, it may be difficult to understand why the model is making certain decisions. This can make it challenging to explain the model’s predictions to stakeholders and gain their trust.
  4. Bias and unfairness: Epistemic uncertainty can lead to bias and unfairness in machine learning systems. If the model is uncertain about certain inputs or certain groups of individuals, it may produce biased predictions that unfairly disadvantage certain individuals or groups.
  5. Increased computational cost: Bayesian deep learning methods that explicitly model epistemic uncertainty can be computationally expensive compared to standard deep learning methods. This can increase the cost of developing and deploying machine learning systems.

How to Mitigate Epistemic Uncertainty

Mitigating epistemic uncertainty in deep learning can improve the reliability and robustness of machine learning systems. Here are some approaches to mitigating epistemic uncertainty:

  1. Increase the size and diversity of the training data: Epistemic uncertainty arises when the neural network is not certain about the mapping between the input and output data. By increasing the size and diversity of the training data, we can reduce epistemic uncertainty by providing the neural network with more examples of the underlying distribution of the data.
  2. Regularization: Regularization techniques such as dropout, weight decay, and early stopping can help to mitigate epistemic uncertainty by preventing overfitting and encouraging the neural network to learn more robust representations of the data (a brief sketch follows this list).
  3. Model selection: Choosing a model architecture that is well-suited for the task at hand can help to reduce epistemic uncertainty. For example, using a convolutional neural network for image classification tasks can reduce epistemic uncertainty compared to using a fully connected neural network.
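
As an illustration of the regularization point above, here is one way dropout, weight decay, and early stopping might fit together in a PyTorch training loop; the synthetic data, layer sizes, hyperparameters, and patience of 10 epochs are assumptions made for the sketch:

```python
import torch
import torch.nn as nn

# Synthetic data standing in for a real train/validation split.
x_train, y_train = torch.randn(200, 20), torch.randn(200, 1)
x_val, y_val = torch.randn(50, 20), torch.randn(50, 1)

# Dropout in the architecture, weight decay in the optimizer (values are placeholders).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.3), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(200):
    model.train()
    opt.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt.step()

    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # early stopping: quit once validation stalls
            break
```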

Conclusion

Uncertainty will always exist in any machine learning problem. It stems not only from the training data and the real-world data the model encounters, but also from the ML model itself and any biases humans build into the algorithm. Although some level of uncertainty will always remain, there are numerous techniques to identify, quantify, and reduce it.

