This JAMA Guide to Statistics and Methods reviews the use of logistic regression methods to quantify associations between patient characteristics and clinical outcomes.
In an article published in JAMA, Seymour et al1 presented a new method for estimating the probability of a patient dying of sepsis using information on the patient's respiratory rate, systolic blood pressure, and altered mentation. The method used these clinical characteristics, called predictor, explanatory, or independent variables, to estimate the likelihood of a patient having an outcome of interest, called the dependent variable. To determine the best way to use these clinical characteristics, the authors used logistic regression, a common statistical method for quantifying the relationship between patient characteristics and clinical outcomes.2
Why Is Logistic Regression Used?
One use of logistic regression is to estimate the probability that an event will occur or that a patient will have a particular outcome using information or characteristics that are thought to be related to or influence such events. Logistic regression can show which of the various factors being assessed has the strongest association with an outcome and provides a measure of the magnitude of the potential influence. It also has the ability to “adjust” for confounding factors, ie, factors that are associated with both other predictor variables and the outcome, so the measure of the influence of the predictor of interest is not distorted by the effect of the confounder.
Although logistic regression can be used to evaluate epidemiological associations that do not represent cause and effect, this chapter focuses on the use of logistic regression to create models for predicting patient outcomes. In this context, the term predictors is used to refer to the independent factors (variables) for which the influences are being quantified, and the term outcome is used for the dependent variable that the logistic regression model is trying to predict.
Description of the Method
Patient outcomes that can have only 2 values (eg, lived vs died) are called binary or dichotomous. The outcomes for groups of patients can be summarized by the fraction of patients experiencing the outcome of interest or, similarly, by the probability that any single patient experiences that outcome. However, to understand the results of a logistic regression model, it is important to understand the difference between probability and odds. The odds of an event are the probability that the event will occur divided by the probability that it will not occur. For example, if there is a 75% chance of survival and a 25% chance of dying, then the odds of survival are 75%:25%, or 3 to 1 (ie, 3). Logistic regression quantitatively links one or more predictors thought to influence a particular outcome to the odds of that outcome.2
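The relationship between probability and odds can be illustrated with a short Python sketch. The function names, and the intercept and coefficients passed to `predicted_probability`, are hypothetical and chosen only for illustration; they are not taken from the model of Seymour et al. The last function assumes the standard logistic form, in which the log of the odds is a linear combination of the predictors.

```python
import math


def probability_to_odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) to odds: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert nonnegative odds back to a probability: odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be nonnegative")
    return odds / (1.0 + odds)


def predicted_probability(intercept, coefficients, predictors):
    """Probability of the outcome under a logistic model.

    Assumes log(odds) = intercept + sum(coefficient_i * predictor_i).
    All coefficient values supplied by the caller are hypothetical.
    """
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-log_odds))


# The example from the text: a 75% chance of survival corresponds to odds of 3.
survival_odds = probability_to_odds(0.75)
survival_probability = odds_to_probability(survival_odds)
```

Here `survival_odds` is 3.0 and converting it back recovers the probability of 0.75, which shows that odds and probability carry the same information on different scales.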
The change in the odds of an outcome—for example, the ...