Chapter 10

### Introduction

In discussions of whether a trial was large enough, you may have heard people refer to the power of the trial, as presented by the authors in their sample size calculations. Such discussions are complex and confusing. As we illustrate in this chapter, whether a trial or meta-analysis is large enough depends only on the confidence interval (CI).

Hypothesis testing, on which sample size calculations are typically based, involves estimating the probability that observed results would have occurred by chance if a null hypothesis, which states that there is no difference between a treatment condition and a control condition, were true. Health researchers and medical educators have increasingly recognized the limitations of hypothesis testing<sup>1-5</sup>; consequently, an alternative approach, estimation, is becoming more popular.

### How Should We Treat Patients with Heart Failure? A Problem in Interpreting Study Results

In a blinded randomized clinical trial of 804 men with heart failure, investigators compared treatment with enalapril (an angiotensin-converting enzyme [ACE] inhibitor) to treatment with a combination of hydralazine and nitrates.<sup>6</sup> In the follow-up period, which ranged from 6 months to 5.7 years, 132 of 403 patients (33%) assigned to receive enalapril died, as did 153 of 401 patients (38%) assigned to receive hydralazine and nitrates. The P value associated with the difference in mortality is .11.

Looking at this study as an exercise in hypothesis testing and adopting the usual 5% risk of obtaining a false-positive result, we would conclude that chance remains a plausible explanation for the apparent differences between groups. We would classify this as a negative study (ie, we would conclude that no important difference existed between the treatment and control groups).
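The P value of .11 quoted for the difference in mortality can be approximately reproduced from the reported counts. The sketch below assumes a pooled two-proportion z test with a two-sided P value; the text does not say which test the investigators used, so this is an illustration and not their method. The function name `two_proportion_z_test` and the variable names are hypothetical.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Pooled two-proportion z test (an assumed choice of test).

    Returns (risk_a, risk_b, z, two_sided_p).
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    # Pooled event proportion under the null hypothesis of no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    # Standard error of the difference in proportions under the null.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (risk_b - risk_a) / se
    # Two-sided P value from the standard normal distribution:
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return risk_a, risk_b, z, p


# Counts reported in the trial: deaths / patients assigned to each arm.
risk_enalapril, risk_hydralazine, z, p_value = two_proportion_z_test(
    132, 403, 153, 401
)

ALPHA = 0.05  # the usual 5% risk of a false-positive result

print(f"Mortality, enalapril:            {risk_enalapril:.1%}")
print(f"Mortality, hydralazine-nitrates: {risk_hydralazine:.1%}")
print(f"z = {z:.2f}, two-sided P = {p_value:.2f}")
print("Below the 5% threshold" if p_value < ALPHA else "Not below the 5% threshold")
```

Under these assumptions the P value rounds to .11, which is above the 5% threshold and so corresponds to the "negative study" reading described above.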

The investigators also conducted an additional analysis that compared the time pattern of the deaths occurring in both groups. This survival analysis, which generally is more sensitive than the test of the difference in proportions (see Chapter 9, Does Treatment Lower Risk? Understanding the Results), had a nonsignificant P value of .08, a result that leads to the same conclusion as the simpler analysis that focused on relative proportions at the end of the study. The authors also tell us that the P value associated with differences in mortality at 2 years (a point predetermined to be a major end point of the trial) was significant at .016.

At this point, one might excuse clinicians who feel a little confused. Ask yourself, is this a positive trial, dictating use of an ACE inhibitor instead of the combination of hydralazine and nitrates, or is it a negative study, showing no difference between the 2 regimens and leaving the choice of drugs open?

### Solving the Problem: What Are Confidence Intervals?

How can clinicians deal with the limitations of hypothesis testing and resolve the confusion? The solution involves posing 2 questions: (1) “What is the single value most likely to ...
