## Tuesday, January 14, 2003

### Statistical Injustice

Brad DeLong has an ongoing series of One Hundred Interesting Mathematical Computations on his website. Two of the most recent jumped out at me.

First, in OHIMC #10, he describes the letter-switching paradox that I mentioned in a previous post. And unlike the website I cited that time for explanation (because I was too lazy to describe the whole problem and its solution), he gets the moral of the story right.

Second, in OHIMC #9, he describes the false-positive problem that arises in the context of tests for disease. The bottom line is that even if you test positive for a disease, there is a surprisingly high chance that you don't have it. The reason, in a nutshell, is that for most diseases, even ones considered epidemics, the vast majority of the population is not infected. As a result, even a small false-positive rate will produce a very large absolute number of people who test positive even though they are disease-free. Even if the test correctly identifies everyone who has the disease, that group is likely to be small compared to the group of false positives. (If this doesn't make sense, go to DeLong's site and follow the math.)
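For readers who prefer a formula, the same argument can be sketched with Bayes' rule. The symbols are shorthand introduced for this sketch, not DeLong's notation: $p$ is the fraction of the tested population that has the disease, $s$ is the chance the test catches a true case, and $f$ is the false-positive rate.

$$
P(\text{disease} \mid \text{positive}) = \frac{s\,p}{s\,p + f\,(1 - p)}
$$

When $p$ is small, the term $f\,(1 - p)$ in the denominator can outweigh $s\,p$ even if $f$ is small, and that remains true for a test with $s = 1$.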

What DeLong doesn't mention is that the false-positive problem is most acute when someone takes a test for a disease without having any particular reason to think he might have it. If there is some a priori reason for thinking you have the disease (e.g., the test is for HIV, and you've been having lots of promiscuous unprotected sex and sharing needles), then the problem is not as severe, though it still exists. I think this is especially relevant in the context of proposals to, for instance, test all school athletes for steroid or other drug use. Quite aside from the legal and ethical problems related to invasion of privacy, policies like these can easily punish far more innocent students than guilty ones whenever actual use is rare. Taking DeLong's sample numbers, but replacing "disease" with "drug use," testing 10,000 athletes would lead to the punishment of 49 drug users and 199 non-users, so about four out of every five students punished would be innocent. Of course, DeLong's numbers are only hypothetical; if a larger percentage of the whole athlete population used drugs than in the example, the disparity would not be as severe. But the qualitative result will be the same: lots of innocents getting punished.
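The arithmetic behind those two counts is easy to check. DeLong's post is not reproduced here, so the rates in this sketch are assumptions chosen because they reproduce the figures of 49 and 199: 0.5% of athletes use drugs, the test catches 98% of users, and it wrongly flags 2% of non-users. The function name is made up for illustration.

```python
def expected_counts(population, prevalence, sensitivity, false_positive_rate):
    """Return the expected (true positives, false positives) from testing everyone."""
    users = population * prevalence
    non_users = population - users
    true_positives = users * sensitivity
    false_positives = non_users * false_positive_rate
    return true_positives, false_positives


# Assumed rates (not quoted from DeLong): 0.5% prevalence, 98% sensitivity,
# 2% false-positive rate. They are consistent with the 49 and 199 in the text.
tp, fp = expected_counts(10_000, 0.005, 0.98, 0.02)

print(round(tp), round(fp))        # 49 199
print(round(tp / (tp + fp), 3))    # 0.198, the share of positives who are actual users
```

Under these assumptions only about one positive result in five belongs to a real user.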

Thus, the notion of "probable cause" has statistical as well as legal value. Restricting drug and steroid tests to those students who have evinced other signs of use would at least reduce, though not eliminate, the disparity in punishments meted out. Across-the-board testing is not just a procedural injustice; it also leads to greater substantive injustice.
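To see how much "probable cause" helps, here is a rough sketch that holds the test fixed and varies only the prior chance that a tested student is a user. The test characteristics (98% sensitivity, 2% false-positive rate) and the list of priors are illustrative assumptions, not figures from DeLong, and the function name is hypothetical.

```python
def share_of_positives_that_are_users(prior, sensitivity=0.98, false_positive_rate=0.02):
    """Bayes' rule: probability that a student who tests positive is really a user."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Assumed priors: 0.5% stands in for testing everyone; the higher values stand in
# for testing only students who have shown other signs of use.
for prior in (0.005, 0.05, 0.25, 0.50):
    share = share_of_positives_that_are_users(prior)
    print(f"prior {prior:.1%}: {share:.0%} of positives are users")
```

With these assumed numbers the share of positives who are real users climbs from roughly 20% at a 0.5% prior to roughly 72%, 94%, and 98% at priors of 5%, 25%, and 50%. Innocent students are still caught at every level, which is the "reduce, though not eliminate" point above.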