This Guide to Statistics and Methods describes the use of target trial emulation to design an observational study so it preserves the advantages of a randomized clinical trial, points out the limitations of the method, and provides an example of its use.
Quantifying the effect of a treatment on a clinical outcome—causal inference—requires the comparison of outcomes under different courses of action. For example, to quantify the effect of tocilizumab on mortality in critically ill patients with COVID-19, the mortality risk could be compared between a group of patients administered tocilizumab and a group who are not. Ideally, eligible patients would be assigned to these groups at random. The key advantage of such a randomized trial is that both groups are expected to be comparable, and thus any differences in mortality can be attributed to tocilizumab rather than to prognostic differences between the groups.
There are additional reasons randomized trials support causal inference. In a randomized trial, the start of follow-up (time zero) for each participant is clearly specified (time of randomization), as is the assigned treatment group. This clarity regarding time zero and treatment assignment is often taken for granted when discussing the advantages of randomized trials. However, the importance of these features becomes clearer when considering failures in drawing causal conclusions from observational data.
One way to ensure that observational analyses preserve these desirable features of randomized trials is to design them so that they explicitly emulate a hypothetical randomized trial that would answer the question at hand: the target trial.1 In a study using this approach, Gupta et al2 used observational data from nearly 4000 critically ill patients with COVID-19 from 68 US hospitals to estimate the effect on mortality of tocilizumab administered within 2 days following admission to the intensive care unit (ICU).
WHAT IS TARGET TRIAL EMULATION IN THE ANALYSIS OF OBSERVATIONAL DATA?
Target trial emulation is a 2-step process. The first step is articulating the causal question in the form of the protocol of a hypothetical randomized trial that would provide the answer. The protocol must specify certain key elements that define the causal estimands (eligibility criteria, treatment strategies, treatment assignment, the start and end of follow-up, outcomes, causal contrasts) and the data analysis plan.1 The randomized trial described in the protocol becomes the target study for the causal inference of interest.
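As a minimal sketch of the first step, the protocol elements listed above can be written down as a structured record so that any element left unspecified is easy to spot. The class name `TargetTrialProtocol`, the helper `unspecified_elements`, and the example wording of each field are illustrative assumptions, not the protocol used in the cited study.

```python
from dataclasses import dataclass, fields


@dataclass
class TargetTrialProtocol:
    """Key protocol elements of a hypothetical target trial (all free text)."""
    eligibility_criteria: str = ""
    treatment_strategies: str = ""
    treatment_assignment: str = ""
    follow_up_start: str = ""
    follow_up_end: str = ""
    outcomes: str = ""
    causal_contrasts: str = ""
    analysis_plan: str = ""


def unspecified_elements(protocol):
    """Return the names of protocol elements that are still blank."""
    return [f.name for f in fields(protocol)
            if not getattr(protocol, f.name).strip()]


# Illustrative draft only: the wording below is assumed for the example.
draft = TargetTrialProtocol(
    eligibility_criteria="Critically ill patients with COVID-19 admitted to an ICU",
    treatment_strategies=("Tocilizumab within 2 days of ICU admission "
                          "vs no tocilizumab in that window"),
    treatment_assignment="Random assignment (hypothetical)",
    follow_up_start="Time of assignment (time zero)",
    outcomes="Mortality",
)

# Elements the draft has not yet specified.
print(unspecified_elements(draft))
```

In this draft, the end of follow-up, the causal contrasts, and the analysis plan are still blank, so the causal question is not yet fully articulated.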
The second step is explicitly emulating the components of that protocol using the observational data: finding eligible individuals, assigning them to a treatment strategy compatible with their data, following them from assignment (time zero) until the outcome or the end of follow-up, and conducting the same analysis as in the target trial, with one exception: the analysis adjusts for baseline confounders in an attempt to emulate random treatment assignment. Sometimes there is ambiguity regarding assignment to a treatment group. For example, ...
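The emulation steps can be sketched on a small, entirely hypothetical cohort. The sketch assumes a single binary baseline confounder (`severe`), a binary outcome recorded by the end of follow-up, and adjustment by standardization over confounder strata. Standardization is chosen here for illustration only and is not necessarily the method used in the cited study. All counts and field names are invented.

```python
from collections import defaultdict

# Hypothetical counts: (severe_at_baseline, treated_within_2_days, n, deaths)
COUNTS = [(1, 1, 40, 16), (1, 0, 20, 10), (0, 1, 10, 1), (0, 0, 30, 6)]


def build_cohort(counts):
    """Expand the count table into one record per hypothetical patient."""
    cohort = []
    for severe, treated, n, deaths in counts:
        for i in range(n):
            cohort.append({"eligible": True, "severe": severe,
                           "treated": treated, "died": int(i < deaths)})
    # One record that fails the eligibility criteria and must be excluded.
    cohort.append({"eligible": False, "severe": 1, "treated": 1, "died": 1})
    return cohort


def risk(rows):
    """Proportion of patients who died by the end of follow-up."""
    return sum(r["died"] for r in rows) / len(rows)


def standardized_risk(rows, treated):
    """Risk under one strategy, standardized to the baseline confounder
    distribution of all eligible patients. Raises ZeroDivisionError if a
    stratum has no patients following the strategy."""
    strata = defaultdict(list)
    for r in rows:
        strata[r["severe"]].append(r)
    total = len(rows)
    result = 0.0
    for stratum_rows in strata.values():
        arm = [r for r in stratum_rows if r["treated"] == treated]
        result += len(stratum_rows) / total * risk(arm)
    return result


cohort = build_cohort(COUNTS)
# Find eligible individuals.
eligible = [r for r in cohort if r["eligible"]]
# Assign each to the strategy compatible with their data.
treated_arm = [r for r in eligible if r["treated"] == 1]
untreated_arm = [r for r in eligible if r["treated"] == 0]
# Compare risks, first unadjusted and then adjusted for the confounder.
crude_rd = risk(treated_arm) - risk(untreated_arm)
adjusted_rd = standardized_risk(eligible, 1) - standardized_risk(eligible, 0)

print(round(crude_rd, 2), round(adjusted_rd, 2))
```

With these invented counts, the crude risk difference is about +0.02, whereas the standardized difference is about -0.10. Treated patients were more often severely ill at baseline, so the unadjusted comparison masks the lower mortality seen within each severity stratum.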