For every treatment, there is a true, underlying effect that any individual experiment can only estimate (see Chapter 6, Why Study Results Mislead: Bias and Random Error). Investigators use statistical methods to advance their understanding of this true effect. This chapter explores the logic underlying one approach to statistical inquiry: hypothesis testing. Readers who wish to teach the concepts reviewed in this chapter to clinical learners may find useful an interactive script we have developed for this purpose.1
The hypothesis-testing approach to statistical exploration begins with what is called a null hypothesis and attempts to disprove that hypothesis. Typically, the null hypothesis states that there is no difference between the interventions being compared. To start our discussion, we will focus on dichotomous (yes/no) outcomes, such as dead or alive, or hospitalized or not hospitalized.
For instance, in a comparison of vasodilator treatments in 804 men with heart failure, investigators compared the proportion of enalapril-treated patients who died with the proportion who died among patients receiving a combination of hydralazine and nitrates.2 We start with the assumption that the treatments are equally effective, and we adhere to this position unless the results make it untenable. We could state the null hypothesis in the vasodilator trial more formally as follows: the true difference in the proportion of patients surviving between those treated with enalapril and those treated with hydralazine and nitrates is 0.
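In symbols, letting π denote the true probability of survival in each group (notation we introduce here for illustration; it is not taken from the trial report), this null hypothesis can be written as:

```latex
H_0 : \pi_{\text{enalapril}} - \pi_{\text{hydralazine+nitrates}} = 0
```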
In this hypothesis-testing framework, the statistical analysis addresses the question of whether the observed data are consistent with the null hypothesis. Even if the treatment truly has no positive or negative effect on the outcome (ie, the effect size is 0), the results observed will rarely agree exactly with the null hypothesis. For instance, even if a treatment has no true effect on mortality, seldom will we see exactly the same proportion of deaths in the treatment and control groups. As the results diverge farther and farther from the finding of “no difference,” however, the null hypothesis that there is no true difference between the treatments becomes progressively less credible. If the difference between the results of the treatment and control groups becomes large enough, we abandon belief in the null hypothesis. We develop this underlying logic further by describing the role of chance in clinical research.
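To make this logic concrete, here is a minimal sketch of one common way to quantify how far an observed result diverges from the null hypothesis: a chi-square test on a 2 × 2 table of dichotomous outcomes. The counts below are hypothetical and chosen only for illustration; they are not the data from the trial described above.

```python
# A sketch of a chi-square test on a 2x2 table of dichotomous outcomes.
# The counts are hypothetical, for illustration only.
from scipy.stats import chi2_contingency

# Rows: treatment groups; columns: died, survived
observed = [
    [130, 270],  # group A: 130 of 400 patients died
    [100, 300],  # group B: 100 of 400 patients died
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}, P = {p_value:.3f}")
# The larger the chi-square statistic, the farther the observed table lies
# from what we would expect if the null hypothesis of "no difference" were
# true, and the smaller the resulting P value.
```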
In Chapter 6, Why Study Results Mislead: Bias and Random Error, we considered a balanced coin for which the true probability of obtaining either heads or tails in any individual coin toss is 0.5. We noted that if we tossed such a coin 10 times, we would not be surprised if we did not see exactly 5 heads and 5 tails. Occasionally, we would get results quite divergent from the 5:5 split, such as 8:2 or even 9:1. Furthermore, very infrequently, the 10 tosses would all yield the same result: 10 heads or 10 tails.
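A short sketch, using only the Python standard library, shows how often each possible split of 10 tosses of a balanced coin is expected to occur:

```python
# Binomial probabilities for the number of heads in 10 tosses of a fair coin.
from math import comb

n = 10
for heads in range(n + 1):
    prob = comb(n, heads) * 0.5 ** n  # probability of exactly this split
    print(f"{heads} heads / {n - heads} tails: {prob:.4f}")

# The 5:5 split is the single most likely result (about 0.246), but divergent
# splits such as 8:2 (about 0.044) or 9:1 (about 0.010) occur occasionally,
# and even 10:0 arises about once in 1024 repetitions of the 10-toss exercise.
```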