Did the Study Patients Represent the Full Spectrum of Those With This Clinical Problem?
The patients in a study are identified or sampled from an underlying target population of people who seek care for the clinical problem being investigated. Ideally, this sample mirrors the target population in all important ways so that the frequency of underlying diseases found in the sample reflects the frequency in the population. A patient sample that mirrors the target population well is termed “representative.” The more representative the sample, the more accurate the resulting disease probabilities. As shown in Box 17-2, we suggest 4 ways to examine how well the study patients represent the entire target population.
BOX 17-2
Ensuring a Representative Patient Sample
Did the investigators define the clinical problem clearly?
Were study patients collected from all relevant clinical settings?
Were study patients recruited consecutively from the clinical settings?
Did the study patients exhibit the full clinical spectrum of this presenting problem?
First, find the investigators' definition of the presenting clinical problem, because this definition determines the target population from which the study patients should be drawn. For instance, for a study of chest discomfort, you would want to find whether the investigators' definition included patients with chest discomfort who deny pain (as many patients with angina do), whether “chest” means discomfort only in the anterior thorax (vs also posterior), and whether patients with obvious recent trauma are excluded. In addition, investigators may specify the level of care or amount of previous evaluation (eg, “fatigue in primary care”6 or “referred for persistent unexplained cough”7). Different definitions identify different target populations, which in turn yield different disease probabilities. A detailed, specific definition of the clinical problem makes it more likely you will be able to confidently judge whether the study population matches the patient before you.
Second, examine the settings from which patients are recruited. Patients with the same clinical problem could present to any of a number of different clinical settings, whether primary care offices, emergency departments, or referral clinics. The choice of where to seek care can involve several factors, including the duration and severity of illness, the availability of various settings, the referral habits of one's clinician, or patient preferences. Different clinical settings are likely to manage patient groups with different disease frequencies. Typically, the frequency of more serious or less common diseases will be greater in secondary or tertiary care settings than in primary care settings. For instance, in a study of patients presenting with chest pain, a higher proportion of referral practice patients had coronary artery disease than did primary care practice patients, even among patients with similar clinical histories.8
Investigators should avoid restricting recruitment to idiosyncratic settings that are likely to treat an unrepresentative patient sample. For instance, for the “fatigue in primary care” problem, although only primary care settings would be relevant, the investigators would ideally recruit from a broad spectrum of primary care settings (eg, those serving patients of varying socioeconomic status). In general, the fewer the relevant sites used for patient recruitment, the greater the risk that the setting will be idiosyncratic or unrepresentative.
Third, note the investigators' methods for identifying patients at each site and how carefully they avoided missing patients. Ideally, they would recruit a consecutive sample of all patients who seek care at the study sites for the clinical problem during a specified period. If patients are not included consecutively, selective inclusion could reduce the representativeness of the sample and reduce confidence in the resulting disease probabilities.
Fourth, examine the spectrum of severity and clinical features exhibited by the patients in the study sample. Are mild, moderate, and severely symptomatic patients included? Are all of the important variations of this presenting clinical problem found in the sample? For instance, for a study of chest discomfort, you would want to determine whether patients with chest discomfort of any degree of severity were included and whether patients were included regardless of whether they had important associated symptoms, such as dyspnea, diaphoresis, or pain radiation. The fuller the clinical spectrum of patients in the sample is, the more representative the sample should be of the target population. Conversely, the narrower the clinical spectrum is, the less representative you would rate the sample. In particular, are patients similar to the one before you well represented?
USING THE GUIDE
Hernandez et al2 defined the clinical problem for their study as “isolated involuntary weight loss,” meaning a verified, unintentional loss of more than 5% of body weight during 6 months without localizing signs or symptoms and with no diagnosis made on initial testing. From January 1991 through December 1996, 1211 patients were referred consecutively from a defined geographic area to their general internal medicine outpatient and inpatient settings for involuntary weight loss, of whom 306 met their definition of “isolated.” Men and women were included, and ages ranged from 15 to 97 years. The authors did not describe the patients' ethnic background or socioeconomic status. The investigators excluded patients if they lost less than 5% of body weight, if they had a previous diagnosis that could explain involuntary weight loss, if the initial evaluation identified the cause, or if weight loss was intentional. Thus, their study sample represents fairly well the target population of patients who are referred for the evaluation of involuntary weight loss. Your patient would have been included in their sample.
Was the Diagnostic Evaluation Definitive?
Articles about disease probability for differential diagnosis will provide valid evidence only if the investigators arrive at correct final diagnoses for the study patients. To judge the accuracy of the final diagnoses, you should examine the diagnostic evaluation undertaken. The more definitive this evaluation, the more likely that the frequencies of the diagnoses made are accurate estimates of the disease frequencies in the target population. In Box 17-3, we suggest 6 ways to examine the question, “How definitive is the diagnostic evaluation?”
BOX 17-3
Ensuring a Definitive Diagnostic Evaluation
Was the diagnostic evaluation sufficiently comprehensive?
Was the diagnostic evaluation consistently applied to all patients?
Were the criteria for all candidate diagnoses explicit and credible?
Were the diagnostic labels assigned reproducibly?
Were there few patients left with undiagnosed conditions?
For patients with undiagnosed conditions, was follow-up sufficiently long and complete?
First, determine how comprehensive the investigators' diagnostic evaluation is. Ideally, the diagnostic evaluation would be able to detect all possible causes of the clinical problem, if any are present. For example, a retrospective study of stroke in 127 patients with mental status changes failed to include a comprehensive search for all causes of delirium, and 118 cases remained unexplained.9 Because the investigators did not describe a complete and systematic search for causes of delirium, the disease probabilities appear less credible.
Second, examine how consistently the diagnostic evaluation was performed. This does not mean that every patient must undergo every test. Instead, for many clinical problems, the clinician takes a detailed yet focused medical history and performs a problem-oriented physical examination of the involved organ systems, along with a few initial tests. Then, depending on the diagnostic clues from this information, further inquiry proceeds down one of multiple branching pathways. Ideally, investigators would evaluate all patients with the same initial evaluation and then follow the resulting clues using prespecified multiple branching pathways of testing. Once a definitive test result confirms a final diagnosis, further testing is unnecessary.
You may find it easy to decide whether the patients' illnesses have been thoroughly and consistently investigated if they were evaluated prospectively with a predetermined diagnostic approach. When clinicians do not standardize their investigation, this becomes harder to judge. For example, in a study of precipitating factors in 101 patients with decompensated heart failure, although all patients underwent a medical history taking and physical examination, the lack of standardization of subsequent testing makes it difficult to judge the thoroughness of the investigations.10
Third, examine the criteria for each disorder used in assigning patients' final diagnoses. Ideally, investigators will develop or adapt a set of explicit criteria for each underlying candidate disorder that could be diagnosed and then apply these criteria consistently when assigning each patient a final diagnosis. When possible, these criteria should include not only the findings needed to confirm each diagnosis but also those findings useful for rejecting each diagnosis. For example, published diagnostic criteria for infective endocarditis include criteria for verifying the infection and criteria for rejecting it.11,12 Investigators can then classify study patients into diagnostic groups that are mutually exclusive, with the exception of patients whose symptoms stem from more than 1 etiologic factor. Because a complete, explicit, referenced, and credible set of diagnostic criteria can be long, it may appear as an appendix or online-only supplement to the published article, such as in a study of patients with palpitations.13
While reviewing the diagnostic criteria, keep in mind that “lesion finding” is not necessarily the same thing as “illness explaining.” In other words, when using credible diagnostic criteria, investigators may find that patients have 2 or more disorders that might explain the clinical problem, causing some doubt as to which disorder is the culprit. Better studies of disease probability will include some assurance that the disorders found actually accounted for the patients' illnesses. For example, in a sequence of studies of syncope, investigators required that the symptoms occur simultaneously with an arrhythmia before that arrhythmia was judged to be the cause.14 In a study of chronic cough, investigators gave cause-specific therapy and used positive responses to this to strengthen the case for these disorders actually causing the chronic cough.7
Fourth, consider whether the assignments of the patients' final diagnoses were reproducible. Ensuring reproducibility begins with the use of explicit criteria and a comprehensive and consistent evaluation, as described above. Investigators can also use a formal test of reproducibility, such as chance-corrected agreement (κ statistic), as investigators did in a study of causes of dizziness.15 The greater the investigators' agreement beyond chance on the final diagnoses assigned to their patients, the more confident you can be in the resulting disease probabilities.
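As a rough illustration of chance-corrected agreement, the sketch below computes Cohen's κ for two raters who each assign one diagnostic label per patient. The function name `cohen_kappa` and the example labels are hypothetical; they are not taken from the dizziness study, and this is only one common form of the κ statistic, not necessarily the one those investigators used.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of patients given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every patient
    return (observed - expected) / (1 - expected)

# Hypothetical final diagnoses for 10 patients (illustrative only).
labels_a = ["vestibular"] * 5 + ["psychiatric"] * 3 + ["cardiac"] * 2
labels_b = ["vestibular"] * 4 + ["psychiatric"] * 4 + ["cardiac"] * 2
kappa = cohen_kappa(labels_a, labels_b)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 9 of 10 patients, but because some of that agreement would be expected by chance alone, κ is lower than the raw 90% agreement.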
Fifth, look at how many patients' conditions remain undiagnosed despite the study evaluation. Ideally, a comprehensive diagnostic evaluation would leave no patient's illness unexplained, yet even the best evaluation may fall short of this goal. The higher the proportion of undiagnosed patients, the greater the chance of error in the estimates of disease probability. For example, in a retrospective study of various causes of dizziness in 1194 patients in an otolaryngology clinic, approximately 27% had undiagnosed conditions.16 With more than a quarter of patients' illnesses unexplained, the disease frequencies for the overall sample might be inaccurate.
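One way to see why a large undiagnosed fraction undermines the estimates is to bound a disease's frequency under two extreme assumptions: that none of the undiagnosed patients have the disease, or that all of them do. The sketch below is an illustration only; the function name `frequency_bounds` and the counts in the example are hypothetical and do not come from the dizziness study.

```python
def frequency_bounds(n_total, n_with_disease, n_undiagnosed):
    """Return (lower, upper) bounds on a disease's frequency in a sample.

    lower: assumes no undiagnosed patient has the disease.
    upper: assumes every undiagnosed patient has the disease.
    """
    if n_total <= 0 or n_with_disease < 0 or n_undiagnosed < 0:
        raise ValueError("counts must be non-negative and n_total positive")
    if n_with_disease + n_undiagnosed > n_total:
        raise ValueError("diagnosed plus undiagnosed exceeds the sample size")
    lower = n_with_disease / n_total
    upper = (n_with_disease + n_undiagnosed) / n_total
    return lower, upper

# Hypothetical sample: 1000 patients, 200 diagnosed with a given disease,
# 270 (27%) left without any diagnosis.
low, high = frequency_bounds(1000, 200, 270)
print(f"Disease frequency lies between {low:.0%} and {high:.0%}")
```

The wider the gap between the two bounds, the less the reported frequency can be trusted as an estimate for the target population.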
Sixth, if the study evaluation leaves some patients' conditions undiagnosed, look at the length and completeness of their follow-up and whether additional diagnoses are made and the clinical outcomes are known. The longer and more complete the follow-up, the greater our confidence in the benign nature of the conditions in patients whose conditions remain undiagnosed and yet who are unharmed at the end of the study. How long is long enough? We suggest 1 to 6 months for symptoms that are acute and self-limited and 1 to 5 years for chronically recurring or progressive symptoms.
USING THE GUIDE
Hernandez et al2 described the consistent use of a standardized initial evaluation of medical history, physical examination, blood tests (blood cell counts, sedimentation rate, blood chemical analyses, protein electrophoresis, and thyroid hormone levels), urinalysis, and radiography (chest and abdomen), after which further testing was performed at the discretion of the attending physician. The authors do not list the diagnostic criteria for each disorder. For the patients' final diagnoses, the investigators required finding not only a disorder recognized in the literature to cause weight loss but also a correlation of weight loss with the clinical outcome of the disorder (recovery or progression). Two investigators independently judged final diagnoses and resolved disagreements (<5%) by consensus. An underlying disorder explaining involuntary weight loss was diagnosed for 221 patients (72%); therefore, 85 patients (28%) initially had undiagnosed conditions. During follow-up and repeated evaluations at 3, 6, and 12 months, 55 of these 85 patients were seen, and diagnoses were made for 41, leaving 14 patients whose weight loss remained unexplained at 1 year and 30 patients lost to follow-up. Thus, the reported diagnostic evaluation appears credible, although some uncertainty exists because of unspecified criteria and the 10% loss to follow-up.
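The patient accounting in this appraisal can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the passage above; the variable names and the helper `percent_of_sample` are illustrative and not part of the study.

```python
# Counts reported in the passage above (Hernandez et al).
n_sample = 306                 # patients meeting the "isolated" definition
n_diagnosed_initially = 221    # underlying disorder found on evaluation
n_seen_in_follow_up = 55       # initially undiagnosed patients who were seen again
n_diagnosed_in_follow_up = 41  # diagnoses made during follow-up

# Counts derived by subtraction.
n_undiagnosed_initially = n_sample - n_diagnosed_initially
n_lost_to_follow_up = n_undiagnosed_initially - n_seen_in_follow_up
n_still_unexplained = n_seen_in_follow_up - n_diagnosed_in_follow_up

def percent_of_sample(count):
    """Whole-number percentage of the 306-patient sample."""
    return round(100 * count / n_sample)

for label, count in [
    ("diagnosed initially", n_diagnosed_initially),
    ("undiagnosed initially", n_undiagnosed_initially),
    ("lost to follow-up", n_lost_to_follow_up),
    ("unexplained at 1 year", n_still_unexplained),
]:
    print(f"{label}: {count} ({percent_of_sample(count)}%)")
```

The derived figures match the text: 85 patients (28%) initially undiagnosed, 30 (about 10%) lost to follow-up, and 14 still unexplained at 1 year.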