What is a p-value, and how do you interpret it?

Easy Last updated on May 7, 2022, 1:31 a.m.

The p-value is a probability used in statistical tests to establish the significance of an observed effect. A standard definition is: “The probability of observing under $H_0$ a sample outcome at least as extreme as the one observed is called the P-value. The smaller the P-value, the more extreme the outcome and the stronger the evidence against $H_0$.” (Rohatgi, 2001)

To understand the p-value better, let’s go over a simple example.

We are interested in investing in an ice cream company that is rapidly gaining popularity. Their flagship product is a chocolate bar with vanilla ice cream and nuts. One serving of this ice cream should contain a minimum of 60 grams of hazelnuts. Customers are complaining that the bars now contain less than the promised 60 grams. To work through this issue, we will perform a statistical significance test.

Here’s how we can do it:
Begin by taking a sample of 50 ice cream servings and computing the mean amount of hazelnuts in those 50 servings. This is the sample mean, and we can use it to make inferences about the population mean.

Is there any difference between the sample mean and population mean?

Let’s say the company produces 300,000 ice cream servings a month. Testing all of them would be an absolute nightmare and borderline infeasible! A more efficient method is to take advantage of the [central limit theorem](https://www.datasciencepreparation.com/blog/articles/explain-central-limit-theorem-in-detail/) and draw a random sample from the population of 300,000 servings.
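The sampling step can be sketched in a few lines of Python. This is only an illustration: the population is simulated, and its mean (60 grams) and spread (5 grams) are assumptions made for the sketch, not figures from the company.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: hazelnut weight (grams) in 300,000 servings.
# The mean of 60 and standard deviation of 5 are assumed for illustration.
population = [random.gauss(60, 5) for _ in range(300_000)]

# Simple random sample of 50 servings, drawn without replacement.
sample = random.sample(population, 50)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean: {population_mean:.2f} g")
print(f"sample mean:     {sample_mean:.2f} g")
```

The two means will usually be close but not identical, which is exactly why we need a test to decide whether a gap between them is meaningful.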

Now, our next step will be to create the null and alternative hypotheses:

Null hypothesis ($H_0$): The population mean of hazelnuts in the ice cream is 60 grams per serving. (Hint: We need to provide evidence against this)

Alternative hypothesis ($H_1$): The population mean of hazelnuts in the ice cream is less than 60 grams per serving. (Hint: This is what we are trying to show)

Now, let $E$ be the event of observing a sample outcome at least as extreme as the one we actually observed (here, a sample mean at least as low as ours). The p-value can then be defined as:
$$ \text{p-value} = P(E \mid H_{0}) $$

By convention, we reject the null hypothesis if the p-value is less than the significance level ($\alpha$).

  • If $\text{p-value} < \alpha$: reject the null hypothesis $H_0$. The result is statistically significant; in our example, the data support a population mean below 60 grams.

  • If $\text{p-value} > \alpha$: fail to reject the null hypothesis $H_0$. This does not show that $H_0$ is true, only that there is not enough evidence against it.
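As a rough sketch, the p-value and the decision rule for the ice cream example could be computed as below. It uses a one-sided z-test, which assumes the population standard deviation is known; in practice a t-test is often used when the standard deviation is estimated from the sample. The sample mean of 58.5 grams matches the example used later in this article, while the standard deviation of 5.64 grams is a made-up value chosen so that the result lands near 0.03. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def one_sided_p_value(sample_mean, null_mean, sd, n):
    """Left-tailed p-value for H1: population mean < null_mean (z-test)."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability, under H0, of a sample mean this low or lower.
    return NormalDist().cdf(z)


def decide(p_value, alpha):
    """Apply the rule above: reject H0 only when p-value < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Assumed inputs for illustration only.
p = one_sided_p_value(sample_mean=58.5, null_mean=60, sd=5.64, n=50)
print(f"p-value = {p:.4f} -> {decide(p, alpha=0.05)}")
```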

What does significance level $ \alpha $ mean?

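Alpha, also known as the level of significance, is the risk we are willing to take of rejecting the null hypothesis when it is actually true (a false positive, or Type I error). For example, $\alpha = 0.05$ means we accept a 5% chance of such a wrong rejection.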
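One way to see what alpha means is to simulate many experiments in which the null hypothesis really is true and count how often we would wrongly reject it. The sketch below does that with a one-sided z-test; the standard deviation of 5 grams and the number of trials are assumptions made for the illustration.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
NULL_MEAN = 60.0
SD = 5.0        # assumed population standard deviation
N = 50          # servings per sample
TRIALS = 4000   # number of simulated experiments

false_rejections = 0
for _ in range(TRIALS):
    # H0 is true here: every sample comes from a population with mean 60.
    sample = [random.gauss(NULL_MEAN, SD) for _ in range(N)]
    z = (fmean(sample) - NULL_MEAN) / (SD / math.sqrt(N))
    p_value = NormalDist().cdf(z)  # left-tailed p-value
    if p_value < ALPHA:
        false_rejections += 1

type_i_rate = false_rejections / TRIALS
print(f"share of wrong rejections: {type_i_rate:.3f}")
```

The printed share should land close to the chosen alpha, since alpha is precisely the long-run rate of rejecting a true null hypothesis.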

How to Interpret the p-value?

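The p-value is the probability, computed under the assumption that the null hypothesis is true, of obtaining a result at least as extreme as the one observed. It is not the probability that the null hypothesis is true. In real-world applications, the p-value is calculated from a sample statistic such as the sample mean, since measuring the whole population is impractical, as mentioned in the section above. Take a look at this graph for added clarity: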

Illustration of P-value and significance level. Source: https://www.gigacalculator.com/

In the above graph, the area highlighted in blue is the p-value for the observed point (shown in green): the probability of a result at least as extreme as that point. The red arrows on the left and right-hand sides indicate very unlikely observations.

The most important thing to note is that a p-value (shaded blue area) is the probability of obtaining a result at least as extreme as the observed one purely by chance, assuming the null hypothesis is true!

Note that even if the true population mean is 60 grams of hazelnuts, this does not mean that every serving contains exactly 60 grams. Some servings could have more than 60 grams while others could have less, and these differences balance out when we compute the population mean.

The p-value is used to determine whether the outcome of this experiment is statistically significant. As mentioned before, a low p-value (below alpha) means that the null hypothesis is rejected, while a high p-value means that we do not have enough evidence to reject it. A high p-value does not prove that the null hypothesis is true.
In simple terms, the lower the p-value, the more surprising the observed data would be if the null hypothesis were true. A very low p-value therefore makes the null hypothesis look unlikely!

Say we get a sample mean of approximately 58.5 grams and a p-value of 0.03. This means that if the population mean of the hazelnuts really were 60 grams, only about 3 out of every 100 repeated samples would give a sample mean this low or lower.

How likely is such a result if the null hypothesis is true? Very unlikely, right? So, at a significance level of, say, 0.05, we reject the null hypothesis, as we have a good amount of evidence against it.
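The "3 out of every 100" reading can be checked with a small simulation: draw many samples from a population in which the null hypothesis holds, and count how often the sample mean comes out at 58.5 grams or lower. The standard deviation of 5.64 grams is an assumption, picked only because it makes a 58.5 gram sample mean correspond to a p-value of roughly 0.03 for 50 servings.

```python
import random
from statistics import fmean

random.seed(2)  # fixed seed so the sketch is reproducible

NULL_MEAN = 60.0
SD = 5.64            # assumed so that 58.5 g corresponds to p of about 0.03
N = 50               # servings per sample
OBSERVED_MEAN = 58.5
TRIALS = 10_000      # number of repeated samples under H0

as_extreme = 0
for _ in range(TRIALS):
    sample = [random.gauss(NULL_MEAN, SD) for _ in range(N)]
    if fmean(sample) <= OBSERVED_MEAN:
        as_extreme += 1

simulated_p = as_extreme / TRIALS
print(f"share of samples with mean <= 58.5 g: {simulated_p:.3f}")
```

The simulated share is an estimate of the p-value itself, so it should come out near 0.03 under these assumptions.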

Now we can relate to the level of significance better! The alpha value depends directly on the problem we are trying to solve: it is the risk we are willing to accept of rejecting a null hypothesis that is actually true. It is okay to accept greater risk in the case of hazelnuts in your favorite ice cream, but would you accept the same risk if you lived near a nuclear power plant and wanted to understand the risk of a nuclear meltdown? We would want to be a bit more sure, then!

Overall, the p-value is just a statistical tool used to challenge your initial belief (the null hypothesis). This is how the p-value and significance levels are used to make critical assessments in the case of hypothesis testing!

Let’s take a look at some more examples of using the p-value in a real-life statistical setting.

Example 1: If the p-value obtained is 0.0315 and the level of significance is 5%, is it possible to reject the null hypothesis?

The given p-value is 0.0315, which is less than the level of significance which is 0.05. This means that we reject the null hypothesis in this case.

Example 2: If the p-value obtained is 0.315 and the level of significance is 4%, is it possible to reject the null hypothesis?

Looking at the level of significance to be 0.04 and the obtained p-value to be 0.315, it is clear that the p-value is greater than the level of significance. Hence, we cannot reject the null hypothesis.

Let us look at another interesting example.

Example 3: If the p-value obtained is 0.06 and the level of significance is 6%, is it possible to reject the null hypothesis?

In this case, the p-value is exactly equal to the alpha value! There are two ways to think about the solution.

  • Under the strict rule (reject only if the p-value is less than alpha), we fail to reject the null hypothesis. Note that the null hypothesis is never "accepted": a test can only reject it or fail to reject it. If we do not reject the null hypothesis, it simply means we do not have sufficient data to conclude.

  • Under the other common convention (reject if the p-value is less than or equal to alpha), we reject the null hypothesis. There is no standard set in stone for this boundary case. Hence, when making a decision such as this, we need to understand the entire context of the experiment. Knowing facts about the nature of the data, the quantity and quality of data, and the importance of the conclusion will help arrive at a solution.
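The boundary case comes down to whether the comparison is strict or inclusive. A tiny sketch, with hypothetical function names, makes the difference explicit:

```python
def reject_strict(p_value, alpha):
    """Reject H0 only when the p-value is strictly below alpha."""
    return p_value < alpha


def reject_inclusive(p_value, alpha):
    """Reject H0 when the p-value is below or equal to alpha."""
    return p_value <= alpha


p_value, alpha = 0.06, 0.06  # the values from Example 3
print("strict rule rejects H0:   ", reject_strict(p_value, alpha))
print("inclusive rule rejects H0:", reject_inclusive(p_value, alpha))
```

Whichever rule is used, it should be fixed before looking at the data rather than picked afterwards to get the desired answer.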

In conclusion, the p-value is an important tool in statistics. The critical thing to note is that the p-value measures the strength of the evidence against the null hypothesis. The p-value also depends on the sample size: for the same observed difference, a larger sample gives a smaller p-value, so with a very large sample even a tiny difference can turn out to be statistically significant.
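The effect of sample size can be illustrated by holding the observed difference fixed and varying only the number of servings. The sketch below uses a one-sided z-test; the observed mean of 59.5 grams and the standard deviation of 5 grams are assumed values for the illustration.

```python
import math
from statistics import NormalDist

NULL_MEAN = 60.0
OBSERVED_MEAN = 59.5  # assumed: a small shortfall of 0.5 g
SD = 5.0              # assumed population standard deviation


def left_tail_p_value(n):
    """Left-tailed z-test p-value for a sample of n servings."""
    z = (OBSERVED_MEAN - NULL_MEAN) / (SD / math.sqrt(n))
    return NormalDist().cdf(z)


sample_sizes = [50, 200, 1000, 5000]
p_values = [left_tail_p_value(n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n:>5}: p-value = {p:.2e}")
```

The same 0.5 gram shortfall is far from significant with 50 servings but overwhelmingly significant with 5,000, which is why the p-value should be read alongside the size of the effect and not on its own.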

Good job on reading through this article!

References
1. The p value – definition and interpretation of p-values in statistics
2. Greenland, S. et al. (2016) “Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations”, European Journal of Epidemiology 31:337–350