Uncertainty and risk
An overview of the concepts of uncertainty and risk, their links to other aspects of statistical literacy and communication and understanding of risk.
This guide is one in a series on different aspects of statistical literacy. The others can be found in the House of Commons Library's Good Information Toolkit.
The concepts of uncertainty and risk are fundamental to statistics and important in many aspects of people’s daily lives as well as different fields of science and economics. This article describes how uncertainty is measured, interpreted and presented. It highlights its connection to other aspects of statistical literacy and looks at the communication and understanding of risk.
What are uncertainty and risk?
Uncertainty exists where there is more than one possible outcome of an event.
Risk is a way of quantifying uncertainty and gives an idea of how likely something is to happen.
For a single event occurring or not occurring
If an event, or outcome of a single event, is absolutely guaranteed to happen then there is no uncertainty about it. Equally, if there is no possibility whatsoever of an event happening there is also no uncertainty.
Some events or outcomes which are certain over one period of time are uncertain over a different period of time. For instance, it is certain that someone will eventually die, but uncertain whether they will die in the next 20 years.
Risk is a quantified uncertainty, and in statistics the term ‘risk’ applies to both positive and negative outcomes. Risk can also be called probability, odds, likelihood and so on.
For example, it is uncertain whether a set of six specific numbers will be drawn in the lottery in any given week. It may be highly unlikely that this particular combination will come up, but it is possible, so the outcome is uncertain. The risk of this (highly positive) outcome can be calculated: the chance that any one set of six numbers will be drawn from 49 in an unbiased process is about 1 in 14 million for each draw.
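As a rough check on the lottery figure, the number of equally likely outcomes can be counted directly. The sketch below assumes six numbers are drawn from 49 without replacement and that order does not matter. The function name `jackpot_combinations` is illustrative and is not taken from the guide.

```python
from math import comb


def jackpot_combinations(pool_size: int = 49, numbers_drawn: int = 6) -> int:
    """Count the distinct sets of numbers that could be drawn.

    Assumes an unbiased draw without replacement where order is ignored,
    so every set is equally likely.
    """
    return comb(pool_size, numbers_drawn)


combinations = jackpot_combinations()
# The chance of one specific set coming up is 1 in `combinations`.
print(f"1 in {combinations:,}")  # 1 in 13,983,816
```

Under these assumptions the chance is 1 in 13,983,816, or roughly 1 in 14 million.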
For quantities and frequencies
The concept of uncertainty can be extended from single events (where there are only two possible outcomes: something happening or not happening) to quantities and frequencies (where there is a range of different possible outcomes). Examples might include voting intentions from opinion polling, estimated costs of a building project, or the number of days that it might rain next month. Here the uncertainty is usually more obvious as the quantity is clearly being estimated or the associated event or events have not yet happened.
When this uncertainty is quantified then the result is normally a maximum and minimum range or a central estimate, plus or minus an amount (or percentage) due to uncertainty.
The process for quantifying this range of uncertainty can be theoretical (for instance, in sample survey results) or from experience (for instance, from the previous highest and lowest costs of each element in a project). If the uncertainty is calculated from experience, it may not be much more than a best guess and might not be accurate, but can be improved over time.
Even if the quoted range of uncertainty does not have a ‘scientific’ basis, it at least acknowledges there is some uncertainty associated with the quantity in question and helps communicate this to its intended audience.
The concept of uncertainty
Very few events in our daily lives are truly 100% certain. This does not mean they are completely random, or that we know nothing about the risk of them happening, only that we cannot be absolutely sure that they will or will not happen. Accepting that there is some uncertainty involved is a necessary first step in thinking about the associated risk of the event happening and considering alternative possible outcomes.
In his book Reckoning with Risk – Learning to Live with Uncertainty, psychologist Gerd Gigerenzer called the belief that an event is absolutely certain (even if it is not) the ‘illusion of certainty’. He sees this as one of the causes of innumeracy, as it leaves no room for considering alternative possibilities or how likely they are, and is associated with an overly simplistic view of causes and consequences.
It is understandable that people should want certainty and predictability. Economists assume that most people are risk averse and are happy to pay a premium for certainty, hence the existence of the insurance industry. In the long run, with insurance companies making profits, people will on average be worse off financially than without any insurance. However, people generally value the ‘certainty’ they gain from insurance, particularly that they will avoid large losses for the insured risk.
Understanding the concept of uncertainty helps people appreciate that many everyday events are complex, the result of multiple related factors, some of which may be poorly understood, and subject to some inherent randomness. The quote at the start of this section about the uncertainties in science refers to natural and physical sciences. Uncertainties in social sciences and hence our daily lives are even greater because they deal with complex human behaviour, motivations and interactions which are sometimes irrational and do not naturally lead to simple definitive conclusions or rules.
One response to uncertainty would be to say that nothing can be concluded about the issue unless you can come up with a definitive unambiguous answer with 100% certainty. This absolutist approach would rule out the analysis of uncertainty in many activities, including all or most social science, much medicine, insurance, many criminal prosecutions, weather forecasting and so on. An approach that takes uncertainty and complexity into account may not come up with superficially appealing definitive conclusions. It aims to be ‘approximately right’. The alternative, which does not acknowledge or analyse the uncertainty of an outcome, risks being ‘exactly wrong’.
Examples where uncertainty/risk is defined
Single events
Examples of risks associated with single events are less common than uncertainties associated with quantities. The most prominent ones are for natural events.
The Environment Agency assesses the risk of flooding for England and Wales. One of its main outputs is the flood map, which estimates the annual risk of flooding from surface water or from rivers and sea for all areas. Each locality is placed into one of four risk categories for each type of flooding: high (risk in any one year is greater than 1 in 30), medium (greater than 1 in 100 but less than 1 in 30), low (greater than 1 in 1,000 but less than 1 in 100) or very low (less than 1 in 1,000).
Weather forecasts attempt to quantify the risk of different kinds of weather events. The Met Office produces probability forecasts and forecasts that give a confidence range for various weather events, and while most weather forecasts don’t quote these for daily reports, they are included for weather warnings. Its severe weather warning service combines estimates of the impact of the type of weather in question with estimates of how likely it is, in order to decide what level of warning to issue. For instance, the most severe red warning is only issued for weather which is very likely and is expected to have a high impact. More information about how the Met Office approaches uncertainty can be viewed at Using ensemble forecasts in probability and decision-making.
Quantities and frequencies
Uncertainties in quantities or frequencies are much more common. The examples below show attempts to quantify the inherent uncertainty in a range of different areas. The uncertainty in these examples comes from different sources including:
- the event, or a contributing factor, is in the future
- it is subject to chance
- data on the subject is lacking
- scientific understanding of the event is imperfect
Official population projections available from the Office for National Statistics give a principal projection, plus 14 different variant projections that make different assumptions about fertility, life expectancy and migration. The principal population projection for the UK in 2047 is 76.6 million. The ‘high population’ variant, which uses the high assumptions for each of the three underlying elements, gives a projected population in 2047 of 83.9 million. The ‘low population’ variant, which uses the low assumptions for each element, gives a projected population of 68.6 million.
What can we conclude from opinion polls?
Estimates from surveys also contain uncertainty, and researchers can use a statistical technique to produce confidence intervals or a margin of error for the results they produce. These may not always be especially prominent in political opinion polls, but assuming a typical sample size of 1,000 and a perfectly random sample, the margin of error is given as ±3 percentage points. In other words, if a political party received a rating of 40% from the sample of people surveyed, we would expect its national rating to be between 37% and 43% if the sample were truly random.
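The ±3 percentage point figure can be roughly reproduced with a short calculation. The guide does not name the technique used, so this sketch assumes the common normal approximation for a proportion at a 95% confidence level (z = 1.96) and a simple random sample. The function name and parameters are illustrative.

```python
from math import sqrt


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion from a simple random sample.

    Assumes the normal approximation and a 95% confidence level (z = 1.96).
    A proportion of 0.5 gives the widest, most cautious margin.
    """
    return z * sqrt(proportion * (1 - proportion) / sample_size)


moe = margin_of_error(1000)
rating = 0.40
print(f"Margin of error: +/-{moe * 100:.1f} percentage points")
print(f"Range for a 40% rating: {(rating - moe) * 100:.0f}% to {(rating + moe) * 100:.0f}%")
```

With a sample of 1,000 this gives a margin of just over 3 percentage points, so a 40% rating corresponds to a range of roughly 37% to 43%. Real polls are rarely perfectly random samples, so their true uncertainty may be larger.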
How much carbon dioxide and other greenhouse gases does the UK produce?
Data on UK greenhouse gas emissions are commonly seen as definitive totals, but despite continued development of the methodology and data sources, they are still subject to a degree of uncertainty. For example, the uncertainty in UK carbon dioxide emissions in 2023 has been estimated to be ±2.1%.
Trends are also affected, but to a lesser degree as some of the uncertainties are assumed to be correlated over time. The central estimate of the change in carbon dioxide emissions between 1990 and 2023 was -51%, and the 95% confidence interval (the range within which we can be reasonably confident that the ‘true’ value lies) was -51% to -48%. The ranges of uncertainty are larger still for other greenhouse gases.
What will inflation be in the future?
The Bank of England’s Monetary Policy Reports include a fan chart of projections of inflation over the next few years. These show the Monetary Policy Committee’s best collective judgement of the most likely paths for inflation (given different assumptions about interest rates and assuming no change in economic circumstances). The chart shows a relatively narrow band in which inflation is expected to lie 30% of the time, a wider band which extends this to 60% and the lightest band which covers 90%. The chart from the May 2025 report is reproduced below.
Source: Bank of England, Monetary Policy Report - May 2025
How much oil is left in the North Sea?
In estimates of the UK’s oil and gas reserves, uncertainty is conveyed in multiple ways. First, resources are placed into one of three main categories: ‘Reserves’, which are recoverable and commercial; ‘Contingent Resources’, which are those in discovered sites that are not yet commercial; and ‘Prospective Resources’, which are potentially recoverable resources in undiscovered sites. Reserves are further grouped into the following categories based on their estimated chance (risk) of being commercially and technically producible:
- Proven: >90%
- Probable: 50% to 90%
- Possible: 10% to 50%
So, for instance, at the end of 2023 the UK’s proven oil and gas reserves were estimated at 2.2 billion barrels of oil equivalent, probable reserves were an additional 1.1 billion barrels and possible reserves a further 0.7 billion barrels, taking total reserves (with a greater than 10% chance of being producible) to 4.0 billion barrels.
Statistical concepts and uncertainty
Much of statistics aims to quantify exactly how likely an event is given certain underlying assumptions about the factors involved. Key to this is an understanding that some results could be the result of random variation. Significance testing is used to establish how likely it is that an observed statistical relationship is real or just appears to exist because of random chance.
More detail is given in the guide Confidence intervals and statistical significance.
Communicating and understanding risk
Different ways of presenting and communicating the risk can alter how people perceive the same underlying data. None of the various alternatives are technically wrong, but understanding the differences between them is important for drawing conclusions about data and reasoning about what it means.
The following methods are set out by the psychologist Gerd Gigerenzer in his book Reckoning with Risk – Learning to Live with Uncertainty. He recommends communicating risks in terms of absolute number of people rather than as percentages of percentages, which helps people interpret risk more reliably.
Statistics presented as relative or absolute risks
Relative risks
Relative risks are often used to compare the risk of an outcome under two sets of circumstances. They are usually presented as a percentage change, which conveys how much more or less likely an outcome is to occur under one set of circumstances than the other.
In medicine, for instance, relative risks can be used to show the reduction in death rates from a disease after taking a drug. So, if 50 out of 1,000 people who did not have the drug died from the disease and 40 out of 1,000 people died who did have the drug then the relative risk reduction is 20%:
- Probability in group without the drug is 50/1000 = 0.05 (or 5%)
- Probability in group with the drug is 40/1000 = 0.04 (or 4%)
- (0.05-0.04)/0.05 = 0.20 or 20%
An alternative to relative risks, recommended by Gigerenzer, is absolute risks. In this example 10 fewer people died out of the 1,000 who received the drug than in the 1,000 who did not have the drug. The absolute risk reduction is 10 divided by 1,000 = 1%.
Absolute risks help inform people about the importance of the change in risk, here by showing how many lives might be saved by giving this drug to 1,000 people. They do this by taking the base mortality rate (here 5%) into account and incorporating an answer to the question “20% of what”.
The same relative risk reduction would look very different, in terms of numbers of people affected, with a much higher base mortality rate. The absolute risk reduction reflects this because it takes the base rate into account. For example, if instead 900 out of 1,000 people died who did not have the drug (a base mortality rate of 90%) and 720 of 1,000 died who did have the drug, the relative risk reduction would still be 20%. However, the number of lives saved per 1,000 people given the drug increases dramatically to 180, an absolute risk reduction of 18%, because of the much higher base mortality rate.
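The two worked examples above can be reproduced with a few lines of code. This is a minimal sketch that assumes both groups are the same size. The function name `risk_reductions` is illustrative and not from the guide.

```python
def risk_reductions(deaths_untreated: int, deaths_treated: int, group_size: int) -> tuple[float, float]:
    """Return (relative risk reduction, absolute risk reduction).

    Assumes the treated and untreated groups each contain `group_size` people.
    """
    base_rate = deaths_untreated / group_size      # risk without the drug
    treated_rate = deaths_treated / group_size     # risk with the drug
    absolute = base_rate - treated_rate            # percentage point change
    relative = absolute / base_rate                # change as a share of the base rate
    return relative, absolute


for untreated, treated in [(50, 40), (900, 720)]:
    relative, absolute = risk_reductions(untreated, treated, 1000)
    print(
        f"{untreated} vs {treated} deaths per 1,000: "
        f"relative reduction {relative:.0%}, absolute reduction {absolute:.0%}"
    )
```

Both cases give a relative risk reduction of 20%, but the absolute risk reduction is 1% in the first case and 18% in the second, because the base mortality rate is so different.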
If a reader is only presented with the relative risk reduction, they won’t be able to tell how important this drug is for the population in question without knowing the underlying incidence of a disease or the base mortality rate. This makes comparisons with other treatments or diseases difficult as it lacks context. The absolute risk reduction figure does not have this problem because the base rate (which is the mortality rate of 5%) is included in the statistic. It tells the reader that if all the sufferers of this particular condition received this drug then the death rate from it would fall by 1 in every 100 sufferers.
However, it is uncommon for all this data to be presented clearly in the material which reaches the public, either in the press or in marketing from pharmaceutical companies. What reaches the public is normally “this drug reduced deaths from this disease by 20%”.
Clearly, given the choice between an absolute and a relative risk reduction, a pharmaceutical company that wanted to emphasise a larger change would choose the relative risk reduction, which is never smaller than the absolute one (the two are only equal when 100% of the group with no treatment died). What is missing in the figure above is “20% of what?”.
The difference between absolute and relative risk reductions is analogous to the difference between percentages expressed in percentage and percentage point terms (see the statistical literacy guide on percentages for more information). In the above example the change is either a 20% reduction in the mortality rate or a 1 percentage point reduction.
Conditional probabilities
A conditional probability is the probability or risk that event A will occur given that B has already occurred. It is very easy to mix up the probability of A given B with the probability of B given A, which can lead people to underestimate or overestimate risks. As above for relative risk and absolute risk, it’s therefore often easier and clearer to communicate these probabilities with natural frequencies.
For instance, what was the risk that someone had covid-19 if they had a positive lateral flow test result?
To answer this question we need to know three things:
- the prevalence of covid-19: the base probability of someone in the general population having covid-19
- the probability of a true positive: this is a type of conditional probability, the probability that a test will correctly show positive given that a person does have covid-19
- the probability of a false positive: this is a type of conditional probability, the probability that a test will wrongly show positive given that a person does not have covid-19
If a person has a positive test, then it might be a true positive or a false positive.
However, the probability of a positive test result being true or false depends on the prevalence in the population. For example, if the prevalence is 0%, then a positive test result can only be a false positive and if the prevalence is 100% then a positive test result can only be a true positive; if the prevalence is somewhere between these extremes, then the probabilities of a positive test result being true or false also change.
This means the probability of someone having covid-19 given they have a positive lateral flow test result (the positive predictive value) is:
the probability of having covid-19 x the probability of a true positive
divided by
(the probability of having covid-19 x the probability of a true positive)
+ (the probability of not having covid-19 x the probability of a false positive)
So if prevalence is 1%, the probability of a true positive is 77% and the probability of a false positive is 0.3%, then a person’s probability of having covid-19 given a positive test is:
0.01 x 0.77 / ((0.01 x 0.77) + (0.99 x 0.003)) = 0.72 or 72%
If the underlying prevalence was higher at 5% then the probability of someone having covid-19, given they have a positive lateral flow test result, would also be higher at 93%.
This rate is different from the probability of a person having a positive test given they have covid-19 (the true positive rate), which is 77%.
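The positive predictive value calculation above can be sketched as a small function. The name `positive_predictive_value` and its parameter names are illustrative. The inputs are the three quantities listed earlier: prevalence, the probability of a true positive and the probability of a false positive.

```python
def positive_predictive_value(
    prevalence: float,
    true_positive_rate: float,
    false_positive_rate: float,
) -> float:
    """Probability of having the condition given a positive test result."""
    # Share of everyone tested who have the condition and test positive.
    true_positives = prevalence * true_positive_rate
    # Share of everyone tested who do not have the condition but still test positive.
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


for prevalence in (0.01, 0.05):
    ppv = positive_predictive_value(prevalence, 0.77, 0.003)
    print(f"Prevalence {prevalence:.0%}: positive predictive value {ppv:.0%}")
```

This reproduces the figures in the text: 72% at a prevalence of 1% and 93% at a prevalence of 5%, with the same test in both cases.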
Sensitivity and specificity
Conditional probabilities connected to screening tests in medicine are frequently termed ‘sensitivity’ and ‘specificity’.
- Sensitivity is the proportion of people who have the disease and receive a positive screening test result (77% in the example above).
- Specificity is the proportion of people who do not have the disease and receive a negative screening result (99.7% in the example above).
- Sensitivity is the true positive rate, specificity the true negative rate.
- False positives (0.3% in this example) add up to 100% with specificity.
- False negatives (23% above) add up to 100% with sensitivity.
For a given test there is normally a trade-off between sensitivity and specificity.
Natural frequencies
Psychologist Gerd Gigerenzer says that many people, including professionals in the relevant field, confuse the risk of A given B with the risk of B given A, or the risk of A and B occurring independently. Instead, he suggests replacing probabilities with natural frequencies. This is done by illustrating all the underlying data in terms of so many people per 100, 1,000, 100,000 and so on. While this may seem identical to using percentages, it means keeping to the same base quantity and avoids taking percentages of percentages (as in conditional probabilities).
Because the numbers involved are people, it can be easier to attach relevant categories to them and ensure that the totals match. There is also less calculation involved for someone who is looking at the data, as part of it has already been done in the presentation of the numbers.
Converting the first example (on conditional probabilities) above into natural frequencies gives: 1,000 out of every 100,000 people tested have covid-19. 770 of these 1,000 will have had a positive test and 297 of the remaining 99,000 without covid-19 would still have had a positive test. To work out the proportion of people who have covid-19 given a positive test, we take the number of true positives divided by the total of all positives (770/(770+297)), which gives 72%, as in the earlier example.
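The conversion to natural frequencies can also be sketched in code. The function below is illustrative only: it assumes a notional group of 100,000 people tested and rounds each count to whole people. The function and key names are not from the guide.

```python
def natural_frequencies(
    population: int,
    prevalence: float,
    true_positive_rate: float,
    false_positive_rate: float,
) -> dict[str, int]:
    """Express test outcomes as whole numbers of people in a notional population."""
    with_condition = round(population * prevalence)
    without_condition = population - with_condition
    return {
        "with_condition": with_condition,
        "without_condition": without_condition,
        "true_positives": round(with_condition * true_positive_rate),
        "false_positives": round(without_condition * false_positive_rate),
    }


counts = natural_frequencies(100_000, 0.01, 0.77, 0.003)
all_positives = counts["true_positives"] + counts["false_positives"]
share_with_condition = counts["true_positives"] / all_positives

print(counts)
print(f"{counts['true_positives']} of {all_positives} positive tests are true positives "
      f"({share_with_condition:.0%})")
```

Because every step works with counts of people from the same starting group, the final step is a single division of true positives by all positives, with no percentages of percentages.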