The <i>p</i>-value: <i>p</i> for problem


  • Leonard C Marais, University of KwaZulu-Natal




The p-value was introduced by Fisher as a method for null hypothesis testing and has since been used widely in science as an indicator of significance.1 It can be defined as a measure of the strength of evidence against the null hypothesis.2 In other words, it is the probability of observing an effect at least as extreme as the one found, if the null hypothesis is true. However, the p-value cannot reliably perform this function unless statistical power is high: if the power of a study is low, a repeat study will likely yield a substantially different p-value. Beta (type II) errors are common in the orthopaedic literature, with up to 28% of randomised controlled trials erroneously failing to reject the null hypothesis.3 Furthermore, the arbitrary cut-off of 0.05 has led to the scientifically unsound practice of regarding so-called ‘significant’ findings as more valuable, reliable or reproducible.4 In fact, treating the p-value as a dichotomous variable is unfounded.2 More worrying is that the use of p-values may have incentivised the introduction of bias, a practice referred to as ‘p-hacking’. These factors have combined to create serious concerns about the validity of many published scientific research findings, culminating in the statement that ‘It can be proven that most claimed research findings are false’.5
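The instability of the p-value under low power can be illustrated with a small simulation. The sketch below (not from the cited studies; the sample size, effect size and seed are illustrative assumptions) repeats an underpowered one-sample z-test many times and reports the spread of the resulting p-values: even though a true effect exists, the p-values range from clearly ‘significant’ to clearly not.

```python
import math
import random

def p_value_two_sided(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def simulated_p(n, effect, rng):
    """One z-test of H0: mean = 0 on n draws from N(effect, 1), sigma known."""
    sample = [rng.gauss(effect, 1.0) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)
    return p_value_two_sided(z)

# Illustrative low-powered design (assumed values): n = 20, true effect = 0.4 SD.
rng = random.Random(42)
ps = sorted(simulated_p(20, 0.4, rng) for _ in range(1000))

# Quartiles of the p-value across 1 000 identical 'repeat studies'.
print("Q1:", round(ps[250], 3), "median:", round(ps[500], 3), "Q3:", round(ps[750], 3))
print("fraction with p < 0.05:", sum(p < 0.05 for p in ps) / len(ps))
```

Under this design roughly half of the repeat studies cross the 0.05 threshold and half do not, which is exactly the behaviour the editorial describes: with low power, the p-value obtained in any single study is largely a matter of chance.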

Author Biography

Leonard C Marais, University of KwaZulu-Natal

PhD, Department of Orthopaedics, Grey’s Hospital, School of Clinical Medicine, University of KwaZulu-Natal; and Editor-in-Chief, South African Orthopaedic Journal






