Loosely quoting Pinker, "Russell M. Nelson is almost certainly a space alien. The probability that a randomly selected person on earth is the president of the LDS Church is tiny: one out of 7.8 billion, or .00000000013. Russell M. Nelson is the president of the LDS Church. Therefore, Nelson is probably not a human being." (page 128)
This is obviously awful reasoning, but why, precisely? Ironically, Pinker explains that it commits the same error as traditional statistical significance testing with p-values: mistaking a likelihood for a posterior. The correction, he argues, is valid Bayesian reasoning. In Pinker's words:
Steven Pinker wrote:
And so the convention arose that scientists should adopt a critical level that ensures that the probability of rejecting the null hypothesis when it is true is less than 5 percent: the coveted “p < .05.” (Though one might have thought that the costs of a Type II error should also be factored in, as it is in Signal Detection Theory, for some equally obscure historical reason it never was.)
That’s what “statistical significance” means: it’s a way to keep the proportion of false claims of discoveries beneath an arbitrary cap. So if you have obtained a statistically significant result at p < .05, that means you can conclude the following, right?

- The probability that the null hypothesis is true is less than .05.
- The probability that there is an effect is greater than .95.
- If you rejected the null hypothesis, there is less than a .05 chance that you made the wrong decision.
- If you replicated the study, the chance that you would succeed is greater than .95.

Ninety percent of psychology professors, including 80 percent of those who teach statistics, think so. But they’re wrong, wrong, wrong, and wrong. If you’ve followed the discussion in this chapter and in chapter 5, you can see why. “Statistical significance” is a Bayesian likelihood: the probability of obtaining the data given the hypothesis (in this case, the null hypothesis). But each of those statements is a Bayesian posterior: the probability of the hypothesis given the data. That’s ultimately what we want—it’s the whole point of doing a study—but it’s not what a significance test delivers. If you remember why Irwin does not have liver disease, why private homes are not necessarily dangerous, and why the pope is not a space alien, you know that these two conditional probabilities must not be switched around. The scientist cannot use a significance test to assess whether the null hypothesis is true or false unless she also considers the prior—her best guess of the probability that the null hypothesis is true before doing the experiment. And in the mathematics of null hypothesis significance testing, a Bayesian prior is nowhere to be found.

Most social scientists are so steeped in the ritual of significance testing, starting so early in their careers, that they have forgotten its actual logic.
Pinker, Steven. Rationality (pp. 224-225)
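Pinker's distinction can be made concrete with a quick Bayes computation. The sketch below uses hypothetical numbers (the prior, the test's power) that are my assumptions, not Pinker's; it shows how P(H0 | significant result) depends on the prior even when P(significant result | H0), the thing a p-value controls, is fixed at .05.

```python
# Illustration of Pinker's point: a significance test fixes the likelihood
# P(significant | H0 true) = alpha, but what we want is the posterior
# P(H0 true | significant), which depends on the prior P(H0 true).
# The alpha and power values below are conventional; the priors are
# hypothetical numbers chosen for illustration.

def posterior_h0_given_significant(prior_h0, alpha=0.05, power=0.80):
    """P(H0 | significant) via Bayes' theorem.

    alpha = P(significant | H0 true)   (the Type I error rate)
    power = P(significant | H0 false)  (1 minus the Type II error rate)
    """
    p_significant = alpha * prior_h0 + power * (1 - prior_h0)
    return alpha * prior_h0 / p_significant

# If most hypotheses a field tests are false leads (prior_h0 = 0.9),
# a "significant" result still leaves a large chance that H0 is true:
print(round(posterior_h0_given_significant(0.9), 3))  # 0.36, not 0.05

# If the hypothesis was an even bet beforehand (prior_h0 = 0.5),
# the same p < .05 result is far more convincing:
print(round(posterior_h0_given_significant(0.5), 3))  # 0.059
```

The same arithmetic dissolves the space-alien argument: however small the likelihood P(president | ordinary human), the prior probability of "space alien" is so close to zero that the posterior stays close to zero too.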