Detailed Evaluation Questions for Primary Papers

Updated December 07, 2006

Questions to Apply to Structured Abstracts:

(adapted from Ann Int Med (1990) 113:69-76, (1993) 118:731-737, and Can Med Assoc (1994) 150:1611-1615)

The abstract is crucial because it is often the only section beyond the title that a busy clinician reads. The recommended headings for structured abstracts follow, each with questions directed at the content of the abstract. Narrative abstracts of clinical studies should cover all of these points as well, but the points are harder to identify without the headings.


Objective:

  • Is the primary question or objective of the study clearly and explicitly stated?


Design:

  • Is the type of study design given (e.g., RBCT (randomized, blinded, controlled trial), cohort, case-control, survey, case series)?
  • Are the critical technical points for the design included (e.g., random allocation, stratified random sampling, convenience, blinding, reference or "gold" standard)?


Setting:

  • Is the type of clinical (primary, referral, or academic), laboratory, or field setting and level of care given? Does this setting differ from mine sufficiently that study generalizability is reduced?

Patients or Subjects:

  • Are the number of subjects, the technical descriptors (random, systematic or haphazard) of how they were selected, the duration of enrollment, and the number of refusals or non-responders given?
  • Are their key demographic characteristics (age, sex, breed) and the spectrum of the clinical disorder sufficiently described? Are the subjects similar enough to those in my clinical setting?
  • If an RBCT (randomized, blinded, controlled trial), are inclusion and exclusion criteria given?
  • If matching was used, are the criteria stated?
  • If follow-up is involved, are the duration of follow-up and the number lost or withdrawn due to adverse effects given?
Interventions:
  • Are the method, duration, dosage, and common clinical names of the drugs in question given?
  • Are the unfamiliar procedures in question briefly described?
Measurements and Main Results:
  • Are clinically relevant measurements and main results given? Are they briefly described if unfamiliar to the intended readership?
  • Are the methods of controlling bias given for subjective measurements?
  • Are 95% confidence limits and assessment of chance (p-values) given for numerical results?
Conclusions:
  • Are the implications of the results and their clinical application, including limitations, clearly stated?
  • Are the conclusions consistent with the results?
  • If further study is needed before the results are applicable in the clinical setting, is this noted?
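
As an illustration of the confidence-limit and p-value question above, the following is a minimal sketch of how a 95% confidence interval and a two-sided p-value might be computed for a difference between two proportions. The function name and the example counts are hypothetical, and the normal (Wald) approximation used here is only one of several acceptable methods; it is unreliable for small samples or proportions near 0 or 1.

```python
import math


def two_proportion_summary(events_a, n_a, events_b, n_b):
    """Risk difference, 95% Wald confidence interval, and two-sided z-test p-value.

    Hypothetical helper for illustration only (normal approximation).
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b

    # Unpooled standard error, used for the confidence interval.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = 1.959964  # 97.5th percentile of the standard normal distribution
    ci = (diff - z_crit * se, diff + z_crit * se)

    # Pooled standard error, used for the test of "no difference".
    pooled = (events_a + events_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

    return diff, ci, p_value


# Hypothetical counts: 30 of 100 events in one group, 18 of 100 in the other.
diff, ci, p_value = two_proportion_summary(30, 100, 18, 100)
print(f"difference = {diff:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"p = {p_value:.3f}")
```

With these made-up counts the interval only just excludes zero, which shows why an abstract that reports the interval tells the reader more about precision than a p-value alone.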


Questions to Apply to Primary Papers:

(adapted from Am J Physiology (1995)13:S21-S25, Med J Australia (1992) 157:389-394 and BMJ (1991)302:1136-1140)


(see "Questions to Apply to Structured Abstracts" above)


Introduction:
  • Is sufficient (but not excessive) background material present, with references, to provide a conceptual and theoretical basis for the study and to indicate why the study question is worth addressing?
  • Are the relevant favorable (supporting) and unfavorable (non-supporting) papers cited?
  • Are the exact question or hypothesis and the specific aims clearly stated, with measurable objectives? Are they clinically relevant?
Materials and Methods:
  • Are the study design and laboratory methods appropriate to address the study question? Are limitations and potential problems with the study methods acknowledged in this or the discussion section?
  • Are references for standard methods or "gold" standards provided and are modifications described in sufficient detail that others familiar with the study area could repeat or extend the study?
  • If new methods or instrumentation are used, are they completely described so others can replicate them? Are the repeatability and reliability of new methods and instruments assessed?
  • Is symmetry established and preserved where possible in the design and execution of the study (controls, blinding, randomization)?
  • Are the methods (e.g., blinding, replication) for assuring the validity, reproducibility, and quality control of measurements addressed, particularly for subjective clinical observations?
  • If sampling from a group is involved, are the source population and the spectrum of disease sufficiently described (i.e., can you determine if a given patient of yours would or would not be eligible for the study)? Are the sampling procedures, inclusion/exclusion criteria, and sample size sufficient?
  • If controls are involved, is the method of selecting controls (in observational studies) or allocating controls (in experimental studies) sufficient? Are the two groups sufficiently comparable at the outset? If controls aren’t used, why not?
  • Are company sources of drugs, chemicals, and devices given? Do any present a strong potential for conflict of interest?
  • Are appropriate statistical procedures used with references given for those that are less common?


Results:
  • Are the outcomes or endpoints appropriate for the clinical study question? Are clinically important side effects addressed?
  • Are the results presented clearly (in tables, graphs, or text) so that they make sense? Is information about the distributions of relevant variables presented?
  • Are potential confounders or the effects of other co-treatments accounted for sufficiently?
  • Are deviations from the research protocol addressed (e.g., unexpected death of subjects, breaking of blinding, mislabeling of samples)?
  • Are results analyzed statistically? Are the statistical procedures executed correctly?
  • Are statistically significant results also clinically or biologically significant? Is the distinction between statistical significance and clinical or biological significance clear?
  • If the results are not statistically significant, is the power of the study to detect the minimum biologically or clinically significant effect given?
  • If subjects are followed over time, are the duration and degree of follow-up sufficient, and are losses to follow-up explained sufficiently?
  • Could selection bias, measurement bias, confounding bias, or chance better account for the results than what is stated?
  • Is there evidence of "data dredging" (desperately searching for significance in a sea of insignificance)?
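
To make the "data dredging" concern concrete, here is a minimal sketch of how quickly the chance of at least one spurious "significant" finding grows with the number of tests performed. It assumes the tests are independent and that every null hypothesis is true, which is a simplification; the function names are hypothetical, and the Bonferroni threshold is shown as one common (conservative) correction, not the only one.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false-positive among n_tests independent tests
    when no real effect exists (illustrative assumption)."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the overall rate at or below alpha."""
    return alpha / n_tests


for k in (1, 5, 20):
    rate = familywise_error_rate(k)
    threshold = bonferroni_threshold(k)
    print(f"{k:2d} tests: chance of a false positive = {rate:.2f}, "
          f"Bonferroni per-test threshold = {threshold:.4f}")
```

Under these assumptions, twenty unplanned comparisons at the 0.05 level carry roughly a two-in-three chance of producing at least one "significant" result by chance alone.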
Discussion and Conclusions:
  • Does the discussion logically relate the findings to a sufficient range of previously published information? Are important recent primary references, particularly contrary ones, included? Do the citations rely only minimally on unpublished or weak primary information (e.g., meeting abstracts and theses)?
  • Are limitations of the study methods addressed?
  • If findings differ from previously published findings, are the potential reasons for the discrepancy discussed sufficiently?
  • If results are not statistically significant, is the minimum biologically or clinically significant effect that the study had the power to detect discussed?
  • Are the potential effects of protocol deviations discussed?
  • If a survey, is the proportion of non-respondents and their similarity to respondents addressed? If a prospective study, are the potential effects of the loss to follow up addressed?
  • Is the relevance of the findings discussed with a minimum of unwarranted speculation or overgeneralization? Is the degree of generalization appropriate for the size, setting, and strength of the study?
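
The question above about the minimum effect a non-significant study had the power to detect can be sketched numerically. The following is a rough illustration only: it assumes a two-sided z-test comparing two proportions with equal group sizes, uses a normal approximation, and ignores the negligible chance of rejecting in the wrong direction. The function names and the example proportions are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (Python 3.8+)


def approximate_power(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for two proportions (equal groups)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = math.sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treated * (1 - p_treated) / n_per_group
    )
    shift = abs(p_treated - p_control) / se
    return _STD_NORMAL.cdf(shift - z_crit)


def smallest_detectable_reduction(p_control, n_per_group,
                                  target_power=0.80, alpha=0.05, step=0.001):
    """Smallest absolute reduction from p_control detectable with target_power.

    Simple grid search; returns None if no reduction reaches the target.
    """
    reduction = step
    while reduction < p_control:
        power = approximate_power(p_control, p_control - reduction,
                                  n_per_group, alpha)
        if power >= target_power:
            return reduction
        reduction += step
    return None


# Hypothetical study: 30% event rate in controls, 50 subjects per group.
print(f"power to detect 30% vs 15%: {approximate_power(0.30, 0.15, 50):.2f}")
print(f"smallest reduction detectable with 80% power: "
      f"{smallest_detectable_reduction(0.30, 50):.3f}")
```

In this made-up example a halving of the event rate would be missed more often than not, so a "no significant difference" conclusion would say little about whether a clinically important effect exists.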
References and Acknowledgements:
  • Are recent, relevant primary papers cited? Are the ones on which the work is based, such as those describing the materials and methods used and those on which interpretation of the results rests, primary papers from refereed scientific journals?
  • Are sources of support acknowledged? Do any of these present a potential conflict of interest?


Additional Questions to Apply to Specific Study Designs:

(Modified from pages 267-270 in: Fletcher RH, Fletcher SW, Wagner EH (1996). Clinical Epidemiology: The Essentials, 3rd ed., Williams and Wilkins, ISBN 0-683-03269-0.)

Cross-sectional (Prevalence) Studies:

  • Are the criteria for being a case clearly described? What are they? For example, the presence of a titer may indicate prior infection, passive protection, or active protection from vaccination, but it may or may not indicate clinical disease.
  • What is the population in which the cases were found? Is this population likely to have a higher or lower prevalence than other populations of interest to the clinician?
  • Are the study subjects an unbiased sample of the population to which the results are applied?

Case-Control Studies:

  • Were cases entered at the onset of disease (incident cases rather than prevalent cases)?
  • Were controls similar to cases in other important respects (e.g., age, breed) except for the exposures of interest?
  • Was exposure information ascertained in the same way for cases and controls, preserving symmetry (e.g., guarding against recall bias and measurement bias from non-blinded observers)?
  • Do the subjects represent a disease spectrum (acute vs. chronic, mild vs. severe) similar to that of your clients?

Cohort Studies and Randomized Clinical Trials:

  • Were study subjects all:
    1. Entered at a clinically useful point (inception cohort), such as after first diagnosis of the problem or after initial treatment? Enrolling subjects after followup time has passed means that prevalent rather than incident cases are enrolled, likely missing those that do either very well (recover rapidly and completely) or very poorly (die).
    2. At risk of developing the outcome? Those subjects who already have the condition or can’t acquire it (e.g. immune from previous exposure, physiologically incapable of acquiring the condition – such as pregnancy) should not be in the study.
    3. At a similar point in the course of the exposure or disease? Prognosis often varies with duration of an exposure or the time since exposure or the disease occurred.
  • Was the followup of all subjects complete? Those lost to followup can bias a study if they do better or worse than those that remain and more are lost from one group than another.
  • Were all subjects examined with equal intensity? Were observers doing the measurements blinded, preserving symmetry and reducing systematic measurement bias?
  • Were factors other than the one of interest that affect the outcome (e.g., age, gender, breed) either equally distributed in the outcome groups or controlled by the method of analysis? If not, could the difference in these other factors account for the outcome observed?
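
The effect of unequal losses to followup, raised in the questions above, can be explored with a simple best-case / worst-case calculation. This is only a sketch under stated assumptions: the function name and all counts are hypothetical, and the two extremes (none of the lost subjects had the event, or all of them did) bracket the truth rather than estimate it.

```python
def risk_bounds(events_observed, n_followed, n_lost):
    """Return (observed, best_case, worst_case) event risk for one study group.

    observed:   risk among subjects with complete followup
    best_case:  assumes none of the lost subjects had the event
    worst_case: assumes all of the lost subjects had the event
    """
    n_enrolled = n_followed + n_lost
    observed = events_observed / n_followed
    best_case = events_observed / n_enrolled
    worst_case = (events_observed + n_lost) / n_enrolled
    return observed, best_case, worst_case


# Hypothetical trial: 100 subjects enrolled per group, more lost from the treated group.
treated = risk_bounds(events_observed=10, n_followed=80, n_lost=20)
control = risk_bounds(events_observed=20, n_followed=95, n_lost=5)

# Negative differences favor the treatment.
observed_difference = treated[0] - control[0]
least_favorable_difference = treated[2] - control[1]

print(f"observed risk difference:        {observed_difference:+.3f}")
print(f"least favorable risk difference: {least_favorable_difference:+.3f}")
```

In this made-up example the apparent benefit among subjects who completed followup reverses under the least favorable assumption, which is the situation the checklist asks the reader to watch for.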
Randomized Trials (in addition to the above):
  • Were subjects randomly allocated to treatment and control groups? Were they allocated as individuals or as groups?
  • Are the results from subjects analyzed on the same basis that the subjects were allocated to treatments? If allocated by group membership, are the results analyzed by group rather than by individuals (error of inflated precision or pseudoreplication)?
  • Were the observers (e.g., owners, clinicians, researchers, technicians) blinded (masked) to the subjects’ group allocation?
  • Were the co-interventions (e.g., other treatments) the same in both groups, preserving symmetry?
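
The "inflated precision" error mentioned above, where subjects allocated by group are analyzed as if they were independent individuals, can be illustrated with the standard design-effect approximation, 1 + (m - 1) x ICC, where m is the group size and ICC is the intraclass correlation. This is a sketch only: it assumes equal group sizes, the function names are hypothetical, and the ICC value in the example is invented for illustration rather than taken from any study.

```python
import math


def design_effect(group_size, icc):
    """Variance inflation from group allocation: 1 + (m - 1) * ICC (equal group sizes)."""
    return 1 + (group_size - 1) * icc


def effective_sample_size(n_groups, group_size, icc):
    """Number of independent subjects that the grouped sample is roughly worth."""
    return n_groups * group_size / design_effect(group_size, icc)


# Hypothetical trial: 10 groups of 20 subjects, assumed ICC of 0.2.
deff = design_effect(group_size=20, icc=0.2)
n_eff = effective_sample_size(n_groups=10, group_size=20, icc=0.2)
se_inflation = math.sqrt(deff)

print(f"design effect: {deff:.1f}")
print(f"effective sample size: {n_eff:.1f} (of 200 subjects enrolled)")
print(f"standard errors understated by a factor of about {se_inflation:.2f} "
      f"if individuals are treated as independent")
```

Under these assumptions, 200 subjects carry roughly the information of about 42 independent ones, so an analysis by individual would report confidence intervals that are far too narrow.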
