Everyone who takes an undergrad stats class learns how to perform statistical significance tests. They learn to choose a confidence level of 95%, corresponding to a significance level (alpha) of 0.05. Sometimes, they learn that you can select a higher or lower confidence level, like 99% (alpha = 0.01) or 90% (alpha = 0.1). But what I’ve gradually realized, from speaking to (and testing) many undergrads, is that they typically have *no clue* why – and more importantly, when – that’s the right level of significance to choose. And I’m increasingly of the opinion that lots of professionals don’t, either. (Maybe I’m one of the ignorant professionals – I’ll judge that from the reactions to this post. Also, nothing I say here is meant as criticism of Jacob, who addresses a related but different question.)

The fact is that alpha = 0.05 is essentially *arbitrary*. Technically, alpha is the probability that your testing method will lead you to *incorrectly reject* some “null” hypothesis. The null is the complement (logical opposite) of the “alternative” hypothesis, which is the claim you’re interested in supporting. To take the minimum wage example, the null hypothesis is that there’s *no* relationship between the minimum wage and employment. With alpha = 0.05, there’s a 5% chance you’ll incorrectly reject that hypothesis and conclude there is such a relationship (when in fact there is not).
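To see what that 5% means in practice, here is a minimal simulation (my own illustrative sketch; the sample size and number of simulations are arbitrary choices, not from the post): both samples are drawn from the same distribution, so the null hypothesis is true by construction, yet a two-sided z-test at alpha = 0.05 still rejects roughly 5% of the time.

```python
import numpy as np

rng = np.random.default_rng(0)
z_crit = 1.96          # two-sided critical value for alpha = 0.05
n_sims, n = 4000, 50   # illustrative values

rejections = 0
for _ in range(n_sims):
    # Both samples come from the same N(0, 1): the null is true by construction.
    a = rng.normal(size=n)
    b = rng.normal(size=n)
    # Two-sample z-test with known sigma = 1.
    z = (a.mean() - b.mean()) / np.sqrt(2.0 / n)
    if abs(z) > z_crit:
        rejections += 1

print(f"false-rejection rate: {rejections / n_sims:.3f}")
```

The printed rate hovers near 0.05 – that is alpha in action: the long-run share of true nulls we wrongly reject.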

But why should alpha be so small? Why place such a high value on not incorrectly accepting our alternative hypothesis? The idea is that, as scientists, we ought not put our faith in a conclusion unless we have very strong proof. And, again as scientists, we must be satisfied to *remain agnostic* if we fail to get statistical significance for a proposition. And this is the key point: *The absence of statistical significance should not lead us to accept the null hypothesis. It should lead us to be agnostic about both the null and the alternative hypothesis.* To take the minimum wage example again, if studies fail to show the minimum wage causes unemployment, the appropriate conclusion is not that there isn’t a relationship, but that *we just can’t say so* with much confidence.

Think about it this way. Above, I supposed we were interested in showing that there *is* a relationship between the minimum wage and unemployment. In order not to make the task too easy on ourselves, we set a rather high bar: 95% confidence. But what if we were interested in showing there’s *not* a relationship? In that case, we are interested in supporting the null, not the alternative, hypothesis. If we set alpha = 0.05, *and if we accept the null whenever we fail to accept the alternative*, then what is the chance of incorrectly affirming that there’s no relationship? It is not 5%, but in fact something much larger – what statisticians call the beta value, corresponding to a Type II error (the error of incorrectly failing to reject the null hypothesis). For a fixed sample size, the smaller the alpha, the larger the beta. And that means using an alpha of 0.05 makes it way, way too easy to claim to have proven the no-relationship hypothesis.
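A quick simulation makes the tradeoff concrete (the effect size, sample size, and simulation count below are made-up illustrative values): with a real but modest effect and a fixed sample size, tightening alpha from 0.05 to 0.01 visibly inflates beta.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims, n, effect = 4000, 50, 0.4   # illustrative: true mean difference of 0.4

def power(z_crit):
    """Fraction of simulations that (correctly) reject the null."""
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(loc=effect, size=n)   # the effect is real by construction
        b = rng.normal(loc=0.0, size=n)
        z = (a.mean() - b.mean()) / np.sqrt(2.0 / n)
        if abs(z) > z_crit:
            hits += 1
    return hits / n_sims

beta_05 = 1 - power(1.96)    # alpha = 0.05
beta_01 = 1 - power(2.576)   # alpha = 0.01
print(f"beta at alpha=0.05: {beta_05:.2f}")
print(f"beta at alpha=0.01: {beta_01:.2f}")
```

Shrinking alpha from 0.05 to 0.01 buys fewer false alarms at the cost of many more missed real effects – exactly the beta inflation described above.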

Using a small alpha makes a lot of sense if you’re choosing between belief and agnosticism, and you wish to give agnosticism the benefit of a doubt. Scientists don’t want to express support for something unless they’re pretty darn sure of it. But what if the choice is not between belief and agnosticism, but between one belief and another belief? In practical decision-making, that is usually the case. The owner of a movie theater has to decide whether students’ ticket-buying behavior differs from the rest of the public’s, and if he makes the wrong decision he will not make as much money as he could have. He has no choice but to pick a belief – either he thinks students are probably different and he charges different prices, or he thinks they are probably the same and he charges the same prices. Similarly, a government can either impose a minimum wage or fail to do so. It can’t remain purely agnostic like the scientist can.

In cases like these, the arbitrary setting of a very small alpha doesn’t make sense, because both the alpha and the beta are important. Small alpha implies large beta. In the case of the minimum wage, a large beta means a high chance of assuming there’s no relationship between the minimum wage and employment even though there is.
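One way a decision-maker could take both error rates seriously is to pick alpha by minimizing expected loss rather than by convention. The sketch below does this for a one-sided z-test; the prior, the two error costs, and the effect size are entirely hypothetical numbers I chose for illustration, not anything from the post.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    # Simple bisection inverse of the normal CDF (avoids needing scipy).
    for _ in range(100):
        mid = (lo + hi) / 2
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical decision problem:
p_effect = 0.5               # prior: the effect is real half the time
cost_I, cost_II = 1.0, 3.0   # a missed effect costs 3x a false alarm
signal = 2.0                 # effect size in standard-error units

def expected_loss(alpha):
    z_crit = norm_ppf(1 - alpha)       # one-sided rejection threshold
    beta = norm_cdf(z_crit - signal)   # miss probability given a real effect
    return (1 - p_effect) * alpha * cost_I + p_effect * beta * cost_II

# Search a grid of alphas for the loss minimizer.
best = min((expected_loss(a), a) for a in [x / 1000 for x in range(1, 500)])
print(f"loss-minimizing alpha: {best[1]:.3f}")
```

With these (made-up) numbers the loss-minimizing alpha lands well above the conventional 0.05 – which is the post’s point: when both errors carry real costs, the right alpha depends on the stakes, not on convention.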

Again, let me emphasize that I’m not trying to make a point about the minimum wage *per se*. The point I’m making here applies to business, public policy, and numerous other cases of practical decision-making in which one must choose between alternate strategies based on alternate beliefs about the world. The decision rules of pure science should not be confused with the decision rules of life.
