This JAMA Guide to Statistics and Methods reviews the use of logistic regression model diagnostics to determine how well a model predicts outcomes.
In 2016, Zemek et al1 published a study that used logistic regression to develop a clinical risk score for identifying which pediatric patients with concussion will experience prolonged postconcussion symptoms (PPCS). The authors prospectively recorded the initial values of 46 potential predictor variables, or risk factors—selected based on expert opinion and previous research—in a cohort of patients and then followed those patients to determine who developed the primary outcome of PPCS. In the first part of the study, the authors created a logistic regression model to estimate the probability of PPCS using a subset of the variables; in the second part of the study, a separate set of data was used to assess the validity of the model, with the degree of success quantified using regression model diagnostics. The rationale for using logistic regression to develop predictive models was summarized in another JAMA Guide to Statistics and Methods.2 (See the chapter, Logistic Regression: Relating Patient Characteristics to Outcomes.) In this chapter, we discuss how well a model performs once it is defined.
Why Are Logistic Regression Model Diagnostics Used?
Logistic regression models are often created with the goal of predicting the outcomes of future patients based on each patient's predictor variables.2 Regression model diagnostics measure how well a model describes the underlying relationships between predictors and patient outcomes in the data, whether those are the data on which the model was built or data from a different population.
The accuracy of a logistic regression model is mainly judged by considering discrimination and calibration. Discrimination is the ability of the model to correctly assign a higher risk of an outcome to the patients who are truly at higher risk (ie, “ordering them” correctly), whereas calibration is the ability of the model to assign the correct average absolute level of risk (ie, accurately estimate the probability of the outcome for a patient or group of patients). Regression model diagnostics are used to quantify model discrimination and calibration.
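The distinction between discrimination and calibration can be illustrated with a minimal sketch. The function names (`concordance`, `calibration_in_the_large`) and the 6 predicted risks and outcomes below are hypothetical, made up for illustration only, and are not taken from Zemek et al. Discrimination is summarized here as the fraction of (patient with outcome, patient without outcome) pairs that the model orders correctly, and calibration is summarized in its simplest form by comparing the mean predicted risk with the observed outcome rate.

```python
from itertools import product


def concordance(probs, outcomes):
    """Discrimination: fraction of (event, non-event) pairs in which the
    patient with the event received the higher predicted risk.
    Tied predictions count as half a correct ordering."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non in product(events, non_events):
        if p_event > p_non:
            score += 1.0
        elif p_event == p_non:
            score += 0.5
    return score / (len(events) * len(non_events))


def calibration_in_the_large(probs, outcomes):
    """Calibration (simplest form): mean predicted risk vs observed rate."""
    mean_predicted = sum(probs) / len(probs)
    observed_rate = sum(outcomes) / len(outcomes)
    return mean_predicted, observed_rate


# Hypothetical predicted risks and observed outcomes (1 = outcome occurred).
predicted = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
observed = [0, 0, 1, 0, 1, 1]

# Halving every predicted risk keeps the ordering of patients unchanged
# but shifts the average level of predicted risk.
halved = [p / 2 for p in predicted]

print(concordance(predicted, observed))
print(concordance(halved, observed))
print(calibration_in_the_large(predicted, observed))
print(calibration_in_the_large(halved, observed))
```

In this sketch, halving every predicted risk leaves the ordering of patients, and therefore the discrimination, unchanged, while the mean predicted risk moves further from the observed rate. A model can thus discriminate well and still be poorly calibrated, which is why the 2 properties are assessed separately.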
Description of the Method
The model developed by Zemek et al discriminates well if it consistently estimates a higher probability of PPCS in patients who develop PPCS vs those who do not; this can be assessed using a receiver operating characteristic (ROC) curve. An ROC curve is a plot of the sensitivity of a model (the vertical axis) vs 1 minus the specificity (the horizontal axis) for all possible cutoffs that might be used to separate patients predicted to have PPCS from patients predicted not to have PPCS (Figure 4).1 Given any 2 random patients, one with PPCS and one without PPCS, the probability that the model will correctly rank ...