# Chapter 10: Confidence Intervals: Was the Single Study or Meta-analysis Large Enough?

In discussions of whether a trial was large enough, you may have heard people refer to the trial's power, as presented in the authors' sample size calculations. Such discussions are complex and confusing. As we illustrate in this chapter, whether a trial or *meta-analysis* is large enough depends only on the *confidence interval* (CI).

Hypothesis testing, on which sample size calculations are typically based, involves estimating the probability that observed results would have occurred by chance if a *null hypothesis*, which states that there is no difference between a treatment condition and a control condition, were true. Health researchers and medical educators have increasingly recognized the limitations of hypothesis testing^{1-5}; consequently, an alternative approach, estimation, is becoming more popular.

In a *blinded randomized clinical trial* of 804 men with heart failure, investigators compared treatment with enalapril (an angiotensin-converting enzyme [ACE] inhibitor) to treatment with a combination of hydralazine and nitrates.^{6} In the *follow-up* period, which ranged from 6 months to 5.7 years, 132 of 403 patients (33%) assigned to receive enalapril died, as did 153 of 401 patients (38%) assigned to receive hydralazine and nitrates. The *P* value associated with the difference in mortality is .11.
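The percentages and the *P* value quoted above can be checked from the raw counts. The sketch below assumes a pooled two-proportion z test with a two-sided normal approximation; the text does not say which test the investigators used, and the function and variable names are illustrative only.

```python
import math

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided P value from a pooled two-proportion z test (assumed method)."""
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    # Under the null hypothesis both groups share one underlying risk,
    # so the standard error is computed from the pooled proportion.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(risk_a - risk_b) / se
    # For a standard normal variable, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

# Deaths and group sizes as reported in the trial description.
enalapril_risk = 132 / 403
hydralazine_risk = 153 / 401
p_value = two_proportion_p_value(132, 403, 153, 401)

print(f"Enalapril: {enalapril_risk:.0%}, hydralazine and nitrates: {hydralazine_risk:.0%}")
print(f"Two-sided P value: {p_value:.2f}")
```

Under these assumptions the result rounds to the reported figures of 33%, 38%, and a *P* value of .11.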

Looking at this study as an exercise in hypothesis testing and adopting the usual 5% risk of obtaining a *false-positive* result, we would conclude that chance remains a plausible explanation for the apparent differences between groups. We would classify this as a *negative study* (ie, we would conclude that no important difference existed between the treatment and *control groups*).
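A 5% false-positive threshold corresponds to a 95% CI: when the interval around the difference in risk includes 0, the matching test is nonsignificant. As a rough sketch of the estimation view, the code below computes the absolute difference in mortality and an approximate 95% CI from the reported counts. It assumes a simple Wald (normal approximation) interval, which is not necessarily the method the investigators or the chapter use, and the function name is hypothetical.

```python
import math

def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk difference (group B minus group A) with an approximate Wald CI.

    z=1.96 gives an approximate 95% interval (assumed method).
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    diff = risk_b - risk_a
    # Unpooled standard error: each group contributes its own variance.
    se = math.sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Group A: enalapril (132 of 403 died); group B: hydralazine and nitrates (153 of 401 died).
diff, lower, upper = risk_difference_ci(132, 403, 153, 401)
includes_zero = lower <= 0 <= upper

print(f"Risk difference: {diff:.1%} (95% CI, {lower:.1%} to {upper:.1%})")
print(f"Interval includes 0: {includes_zero}")
```

Unlike the single yes-or-no verdict of a hypothesis test, the interval shows the range of differences that remain compatible with the data.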

The investigators also conducted an additional analysis that compared the time pattern of the deaths occurring in both groups. This *survival analysis*, which generally is more sensitive than the test of the difference in proportions (see Chapter 9, Does Treatment Lower Risk? Understanding the Results), had a nonsignificant *P* value of .08, a result that leads to the same conclusion as the simpler analysis that focused on relative proportions at the end of the study. The authors also tell us that the *P* value associated with differences in mortality at 2 years (a point predetermined to be a major *end point* of the trial) was significant at .016.

At this point, one might excuse clinicians who feel a little confused. Ask yourself, is this a *positive trial,* dictating use of an ACE inhibitor instead of the combination of hydralazine and nitrates, or is it a negative study, showing no difference between the 2 regimens and leaving the choice of drugs open?

How can clinicians deal with the limitations of hypothesis testing and resolve the confusion? The solution involves posing 2 questions: (1) “What is the single value most likely to ...
