2021 Suez Canal obstruction.

There is a lot of discussion about risk management for AI. While much of it focuses on legal and ethical risks, there is not nearly enough discussion about evaluating and managing the risk caused by uncertainty in AI-supported decision making.

Risk and uncertainty

Here I cite and reproduce a table from Fundamentals of Risk Management by Paul Hopkin [1]:

| Organisation | Definition |
| --- | --- |
| ISO | Effect of uncertainty on objectives. … |
| Institute of Risk Management | Risk is the combination of the probability of an event and its consequence. … |
| Institute of Internal Auditors | The uncertainty of an event that could have an impact on the achievement of the objectives. … |

Two of those three definitions start with “uncertainty” (and the remaining one with “probability”). Clearly, risk arises from dealing with uncertainty or probabilistic events.

Why consider uncertainty?

A common misconception is that uncertainty is bad and should be eliminated. If we have additional information to reduce the uncertainty, that’s great! But it is important to recognise that uncertainty exists everywhere and can never be fully eliminated. Therefore, the key is to account for uncertainty and make risk-aware decisions.

Example 1: A company is planning its production based on AI sales forecasting for the next year. The AI’s predictions have been very accurate under normal market conditions. But several unobserved factors (for example, energy prices) have a large impact on the consumer market. Instead of just hoping for an accurate prediction, it is more important to ask: what is the conditional predictive distribution of sales, given different scenarios (and their likelihoods) of future energy prices?
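To make this concrete, here is a minimal sketch of scenario-conditional forecasting. The scenarios, their likelihoods, and the `sample_sales` stand-in for a trained forecaster are all hypothetical; the point is that mixing conditional predictive distributions by scenario likelihood yields a distribution of sales, not a single number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical energy-price scenarios for next year and their assumed likelihoods.
scenarios = {"stable": 0.6, "moderate_rise": 0.3, "price_shock": 0.1}

# Toy stand-in for a fitted sales forecaster: sales fall as energy prices rise.
# In practice this would be your trained model, conditioned on the scenario.
def sample_sales(scenario, n):
    mean = {"stable": 100.0, "moderate_rise": 90.0, "price_shock": 70.0}[scenario]
    return rng.normal(loc=mean, scale=8.0, size=n)

# Monte Carlo: mix the conditional predictive distributions by scenario likelihood.
n_total = 10_000
draws = np.concatenate([
    sample_sales(s, int(p * n_total)) for s, p in scenarios.items()
])

print(f"mean forecast: {draws.mean():.1f}")
print(f"5%-95% interval: [{np.percentile(draws, 5):.1f}, {np.percentile(draws, 95):.1f}]")
for s, p in scenarios.items():
    cond = sample_sales(s, n_total)
    print(f"{s} (p={p}): conditional mean {cond.mean():.1f}")
```

The interval is much more useful to a production planner than the mean alone: it shows how much of the spread comes from the unobserved energy-price factor.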

Example 2: A company is monitoring a massive number of transactions and using AI to detect fraud in real time. The AI system has been trained on a very large set of transactions and profiles. But some transaction patterns or user profiles could still be out-of-sample and therefore difficult to classify. In this example, we want the AI system to give not only an accurate prediction but also a good uncertainty measure (e.g., a reliable error probability). When the uncertainty is high, that particular case should be handed over to human experts.
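Here is a minimal sketch of such a hand-over rule, assuming the model outputs a (calibrated) fraud probability. The threshold and the simple distance-from-0.5 uncertainty measure are illustrative choices, not the only possibilities.

```python
def route_transaction(fraud_prob, uncertainty_threshold=0.2):
    """Route a transaction based on the model's (assumed calibrated) fraud probability.

    If the predicted probability is close to 0.5, the model is effectively
    unsure, so the case is deferred to a human expert.
    """
    # Uncertainty here is distance from a confident call; other measures
    # (predictive entropy, ensemble disagreement) would work the same way.
    uncertainty = 1.0 - 2.0 * abs(fraud_prob - 0.5)  # 0 = certain, 1 = coin flip
    if uncertainty > uncertainty_threshold:
        return "human_review"
    return "fraud" if fraud_prob >= 0.5 else "legitimate"

for p in (0.02, 0.48, 0.97):
    print(p, "->", route_transaction(p))
```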

What is AI uncertainty?

On a very abstract level, an AI system solves a problem through two steps:

  1. Learning: Given data, build a model between input and output.
  2. Prediction: Based on the model, find a solution which minimises a loss function.

Uncertainty comes from both steps and exists in the data, the model, and the output:

  1. Uncertainty in data: Data are only measurements of reality, not the truth itself (or at best only a part of the truth). Sampling bias, labelling bias, and measurement noise all introduce uncertainty into data. Uncertainty caused by such imperfect knowledge is also called epistemic uncertainty.
  2. Uncertainty in model: To start with, the input-output dynamic is often a stochastic process. Moreover, a model is only an approximation of the real input-output dynamic. The model uncertainty therefore reflects both nature’s stochasticity and our approximate learning (model selection, parameter estimation, etc.). The stochastic component is also called aleatoric uncertainty.
  3. Uncertainty in output: Given the uncertainty in both the data and the model, we can hardly expect the output to be a single point of truth. Instead, the output is a belief distribution over the target.

From an end-to-end perspective, an AI system’s uncertainty is generated from the data, propagated through the model, and combined at the output.
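Here is a toy illustration of this propagation, using bootstrap resampling as a generic (assumed) stand-in for model uncertainty estimation: noise in the data leads to an ensemble of plausible models, whose disagreement shows up as a predictive distribution at the output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Uncertainty in data: noisy measurements of an underlying linear relation.
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

# Uncertainty in model: bootstrap resampling yields an ensemble of fitted models.
coefs = []
for _ in range(500):
    idx = rng.integers(0, x.size, x.size)
    coefs.append(np.polyfit(x[idx], y[idx], deg=1))

# Uncertainty in output: the ensemble's predictions at a new point form a
# belief distribution rather than a single number.
x_new = 12.0
preds = np.array([np.polyval(c, x_new) for c in coefs])
print(f"prediction: {preds.mean():.2f} +/- {preds.std():.2f}")
```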

How reliable is your model’s uncertainty?

Some data scientists might say: “OK, my model outputs a score for its prediction. Problem solved.” The question is: is that “score” really a reliable uncertainty measure? In most cases, the answer is no.

Providing reliable uncertainty is challenging, to say the least. Back in 1950, Glenn Brier verified weather forecasts by looking at the difference between the predicted probability and the observed outcome frequency (the Brier score) [2]. Recent research shows that even state-of-the-art models fail to provide reliable uncertainty. In an ICML ‘17 paper, Guo et al. discovered that modern neural networks’ confidence outputs are unreliable and therefore need to be calibrated [3]. In a NeurIPS ‘19 paper, Ovadia et al. studied the quality of uncertainty under dataset shift. The results show that both accuracy and uncertainty quality degrade under dataset shift, and that calibration under the i.i.d. assumption does not guarantee better uncertainty under dataset shift [4]. Finally, in a CVPR ‘21 paper, the author concluded that most state-of-the-art models do not have proper uncertainty quantification [5].
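For intuition, here is a small sketch of two such diagnostics: the Brier score [2] and a simplified binary-classification version of the calibration gap studied by Guo et al. [3]. The toy predictions and outcomes are made up for illustration.

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared gap between predicted probability and what actually happened [2]."""
    return np.mean((np.asarray(probs) - np.asarray(outcomes)) ** 2)

def expected_calibration_error(probs, outcomes, n_bins=10):
    """Average |predicted probability - empirical frequency| over probability bins;
    a simplified binary version of the calibration gap in Guo et al. [3]."""
    probs, outcomes = np.asarray(probs), np.asarray(outcomes)
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(probs[mask].mean() - outcomes[mask].mean())
    return ece

probs = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.95])   # toy model confidences
outcomes = np.array([1, 1, 0, 0, 0, 1])             # toy ground truth
print("Brier score:", brier_score(probs, outcomes))
print("calibration error:", expected_calibration_error(probs, outcomes))
```

A perfectly calibrated model would score 0 on the calibration error; the Brier score additionally rewards sharp (decisive) predictions.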

For classification problems, a potential way to leverage modern neural networks while producing reliable uncertainty is to use deep generative models. There, neural networks compress high-dimensional data (such as images) into a low-dimensional latent space, and statistical models are then used to fit the data distribution in that space. See the famous paper Auto-Encoding Variational Bayes by Kingma and Welling for more details [6].
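Below is a heavily simplified sketch of this latent-density idea, with a random linear projection standing in for a trained deep encoder (a real system would use, e.g., a VAE encoder [6]). It fits a Gaussian in the latent space and scores new inputs: low latent density signals an out-of-sample input and hence high uncertainty.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in encoder: in a real system this would be the trained encoder of a
# deep generative model (e.g., a VAE [6]) mapping images to latent vectors.
def encode(x, proj):
    return x @ proj

# "Training" data in a 50-dimensional space, compressed to 2-D latent codes.
x_train = rng.normal(size=(1000, 50))
proj = rng.normal(size=(50, 2)) / np.sqrt(50)
z = encode(x_train, proj)

# Fit a simple Gaussian density to the latent training distribution.
mu, cov = z.mean(axis=0), np.cov(z, rowvar=False)
cov_inv = np.linalg.inv(cov)

def latent_log_density(x):
    """Unnormalised Gaussian log-density of a sample's latent code; low values
    suggest the input is unlike the training data, i.e. high uncertainty."""
    d = encode(x, proj) - mu
    return -0.5 * d @ cov_inv @ d

in_dist = rng.normal(size=50)
out_of_dist = rng.normal(loc=5.0, size=50)   # shifted, unlike the training data
print("in-distribution score:    ", latent_log_density(in_dist))
print("out-of-distribution score:", latent_log_density(out_of_dist))
```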

Conclusion

The ancient Chinese philosopher Confucius said: “Knowing how much we don’t know is the true wisdom.” The same could be said about using data and AI to make important decisions. Proper evaluation of AI uncertainty helps businesses understand and manage risk, and ultimately contributes to trust in AI.

To my fellow data scientists: let’s not make our models look like crystal balls, but rather help the business understand and manage risk. We should talk more about the uncertainty of predictions, the assumptions about unobserved factors, and the limited information in the data.

To my business friends, here are some suggestions:

  1. Instead of just a prediction, ask for the uncertainty associated with that prediction;
  2. Instead of just prediction accuracy, ask how good the uncertainty evaluation is;
  3. Keep an eye on unobserved factors which could have a significant impact on your target, and ask for simulations of different scenarios.

References

[1] Hopkin, Paul. Fundamentals of risk management: understanding, evaluating and implementing effective risk management. Kogan Page Publishers, 2018.

[2] Brier, Glenn W. “Verification of forecasts expressed in terms of probability.” Monthly Weather Review 78.1 (1950): 1-3.

[3] Guo, Chuan, et al. “On calibration of modern neural networks.” International conference on machine learning. PMLR, 2017.

[4] Ovadia, Yaniv, et al. “Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift.” Advances in Neural Information Processing Systems 32 (2019).

[5] Valdenegro-Toro, Matias. “I find your lack of uncertainty in computer vision disturbing.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.

[6] Kingma, Diederik P., and Max Welling. “Auto-encoding variational Bayes.” arXiv preprint arXiv:1312.6114 (2013).