# Can one third of the statistically significant results be mistaken?

Posted December 12, 2014

“If you use P=0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time,” claims David Colquhoun of University College London. The well-known British pharmacologist argues that uncritical over-reliance on p-values produces too many false claims in peer-reviewed scientific journals.

Image credit: UCL Institute of Education via Flickr, CC BY-NC 2.0.

He thinks that this issue contributes strongly to the deepening crisis of contemporary science: a large number of studies fail to replicate. Systematic failures to repeat observations suggest that something is wrong with the way inferential procedures are used.

“You make a fool of yourself if you declare that you have discovered something, when all you are observing is random chance. From this point of view, what matters is the probability that, when you find that a result is ‘statistically significant’, there is actually a real effect. If you find a ‘significant’ result when there is nothing but chance at play, your result is a false positive, and the chance of getting a false positive is often alarmingly high,” Colquhoun explains.

He calculated that if a real effect exists in 100 out of 1000 tests, the required significance threshold is 0.05 and the power of each test is 0.8, then 36% of the “significant” results will be mistaken. With power 0.8, 80 of the 100 real effects are detected, while 5% of the 900 tests with no real effect, that is 45, also cross the threshold, so 45 of the 125 significant results are false positives. This disturbing result is obtained even when all assumptions of the statistical tests, such as normality of distribution, are met. This suggests that “any real experiment can only be less perfect than the simulations discussed here, and the possibility of making a fool of yourself by claiming falsely to have made a discovery can only be even greater than we find in this paper.”
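The arithmetic behind the 36% figure can be checked in a few lines. The sketch below is only an illustration, not the author's code: it assumes that a real effect exists in 100 of the 1000 tests (a prevalence of 10%), and the function and parameter names are made up for this example.

```python
def false_discovery_rate(n_tests, prevalence, power, alpha):
    """Share of 'significant' results that are false positives.

    prevalence is the assumed fraction of tests in which a real effect
    exists; it is an assumption of this sketch, not a measured quantity.
    """
    real_effects = n_tests * prevalence      # tests where an effect exists
    no_effects = n_tests - real_effects      # tests where nothing is going on
    true_positives = real_effects * power    # real effects that are detected
    false_positives = no_effects * alpha     # nulls that cross the threshold
    return false_positives / (true_positives + false_positives)


fdr = false_discovery_rate(n_tests=1000, prevalence=0.10, power=0.8, alpha=0.05)
print(f"{fdr:.0%}")  # 45 / (80 + 45), printed as 36%
```

Lowering `prevalence` in this sketch pushes the false discovery rate higher still, which shows how strongly the result depends on how often real effects exist in the first place.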

However, the author of the study admits that the mathematical calculations are open to debate. “An easy way to test these problems is to simulate a series of tests to see what happens in the long run. This is easy to do and a script is supplied, in the R language. This makes it quick to simulate 100 000 t-tests (that takes about 3.5 min on my laptop). It is convincing because it mimics real life,” he says.
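The author's script is written in R and is not reproduced here. As a rough illustration of the same idea, the Python sketch below simulates many two-sample t-tests and counts how many of the “significant” ones come from tests with no real effect. Every setting in it is an assumption of this example: 16 observations per group, a true difference of one standard deviation (which gives a power of a little under 0.8), a real effect in 10% of tests, and a smaller default number of tests so that it runs in a few seconds.

```python
import math
import random

N_PER_GROUP = 16      # assumed sample size per group (30 degrees of freedom)
T_CRIT_DF30 = 2.042   # two-sided 5% critical value of Student's t, 30 d.f.


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two equal-sized groups."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    pooled_var = (var_a + var_b) / 2
    return (mean_b - mean_a) / math.sqrt(pooled_var * 2 / n)


def simulate_false_discovery_rate(n_tests=20_000, prevalence=0.1,
                                  effect=1.0, seed=1):
    """Fraction of significant t-tests in which no real effect existed."""
    rng = random.Random(seed)
    true_pos = false_pos = 0
    for _ in range(n_tests):
        real = rng.random() < prevalence      # does this test have an effect?
        shift = effect if real else 0.0
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(shift, 1.0) for _ in range(N_PER_GROUP)]
        if abs(t_statistic(control, treated)) > T_CRIT_DF30:  # p < 0.05
            if real:
                true_pos += 1
            else:
                false_pos += 1
    return false_pos / (true_pos + false_pos)


simulated_fdr = simulate_false_discovery_rate()
print(f"Simulated false discovery rate: {simulated_fdr:.1%}")
```

Comparing the t statistic with a fixed critical value is equivalent to checking p < 0.05 for this sample size, and it keeps the sketch free of external libraries. With these settings the result should come out at roughly a third, in line with the calculation in the article.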

Interestingly, the results of these simulations confirmed that the proportion of false positives will be 36%. This means that the likelihood of reporting an effect when there is none is very high. How can this problem be solved? “If you want to avoid making a fool of yourself very often, do not regard anything greater than p < 0.001 as a demonstration that you have discovered something,” the famous scholar advises.

Article: Colquhoun D., 2014, An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1: 140216.
