Background: Inappropriate utilization of daily labs, namely complete blood counts (CBCs) and serum electrolyte panels (SEPs), is an important cause of increased costs in the hospital setting. The Minnesota Lab Appropriateness (MLAB) criteria were previously developed to facilitate assessment of CBC and SEP appropriateness (1). Applying the criteria requires a combination of clinical judgment and common healthcare data, including recent vital signs and lab results. Our goal was to develop an algorithm based on the MLAB criteria capable of retrospectively evaluating lab tests for appropriateness.

Methods: The MLAB criteria were translated into formulas as directly as possible. Criteria components relying on clinical judgment were excluded. Standard upper and lower limits of normal (ULN, LLN) were used to define vital sign ranges, with the exception of blood pressure, for which the definition of hypertensive crisis (180/120 mmHg) was used as the ULN. For laboratory values, the ULN and LLN were defined by our institution’s reference ranges. The formulas were then implemented in a macro-enabled Excel document using Excel formula functions. Twenty hospitalizations (106 CBCs, 109 SEPs) were evaluated for appropriateness by the algorithm. A physician reviewer independently rated the appropriateness of each CBC and SEP using the full MLAB criteria, including the clinical judgment components. Appropriateness was graded on a dichotomous (appropriate/inappropriate) scale. Algorithm and physician reviewer results were analyzed for interrater reliability, sensitivity, and specificity.
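
As an illustration of the range-based logic described above, the following is a minimal Python sketch, not the Excel implementation used in the study. It assumes a simplified rule in which a lab is rated appropriate when any recent value falls outside its LLN/ULN range. The names `Range`, `HYPOTHETICAL_RANGES`, and `lab_is_appropriate` are hypothetical, and every limit except the systolic ULN of 180 mmHg is a placeholder rather than a value from the MLAB criteria or our institution's reference ranges.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Range:
    """Lower and upper limits of normal (LLN, ULN) for one measurement."""
    lln: float
    uln: float

    def is_abnormal(self, value: float) -> bool:
        # A value is abnormal when it lies strictly outside [LLN, ULN].
        return value < self.lln or value > self.uln


# Placeholder limits for illustration only (assumptions, not study values).
# The systolic ULN follows the hypertensive crisis definition in the text.
HYPOTHETICAL_RANGES = {
    "heart_rate_bpm": Range(lln=60, uln=100),
    "systolic_bp_mmhg": Range(lln=90, uln=180),
    "hemoglobin_g_dl": Range(lln=12.0, uln=16.0),
}


def lab_is_appropriate(recent_values: Mapping[str, float]) -> bool:
    """Assumed simplified rule: any abnormal recent value makes the lab appropriate."""
    return any(
        HYPOTHETICAL_RANGES[name].is_abnormal(value)
        for name, value in recent_values.items()
        if name in HYPOTHETICAL_RANGES
    )
```

Under these placeholder limits, a heart rate of 105 bpm alone is enough for the sketch to rate the next lab as appropriate, whereas a reviewer exercising clinical judgment might not consider that value meaningful.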

Results: The algorithm identified 3% of CBCs and 5% of SEPs as inappropriate, while the physician reviewer using the MLAB criteria identified 29% of CBCs and 11% of SEPs as inappropriate (p<0.001 and p=0.075, respectively). Cohen’s kappa between the algorithm and the physician reviewer was 0.47 for CBCs and 0.76 for SEPs. With the physician reviewer as the reference, the algorithm had a sensitivity of 10% and a specificity of 100% for identifying inappropriate CBCs. For SEPs, sensitivity was 17% and specificity was 97%.
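
For readers who want to see how these agreement statistics are derived, the sketch below computes sensitivity, specificity, and Cohen's kappa from a 2x2 table using the standard formulas. It treats "inappropriate" as the positive class and the physician reviewer as the reference. The function name `agreement_metrics` and the counts in the usage note are hypothetical and are not the study data; the sketch also assumes that both classes occur at least once in the reference ratings.

```python
def agreement_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 table.

    tp: both algorithm and reviewer rated the lab inappropriate
    fn: reviewer rated inappropriate, algorithm rated appropriate
    fp: algorithm rated inappropriate, reviewer rated appropriate
    tn: both rated the lab appropriate
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: proportion of labs on which the two ratings match.
    p_observed = (tp + tn) / n
    # Chance agreement: product of marginal proportions, summed over both classes.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}
```

For example, `agreement_metrics(tp=3, fn=7, fp=1, tn=89)` uses made-up counts in which the two ratings agree on 92 of 100 labs, yet sensitivity is only 0.30 because most labs the reviewer rated inappropriate were missed.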

Conclusions: This first iteration of an algorithm for retrospectively identifying inappropriate CBCs and SEPs showed excellent specificity but poor sensitivity compared to a physician applying the MLAB criteria, a previously published approach for assessing lab appropriateness. Interrater reliability was moderate for the CBC component and substantial for the SEP component. The discrepancy between the algorithm and the physician reviewer was largely attributable to slightly abnormal vital signs and lab results, such as a heart rate of 105 bpm in a patient with post-operative pain. The physician reviewer often interpreted such values as unremarkable, whereas the algorithm flagged them as abnormal and therefore as warranting subsequent lab orders. Next steps include refining the formulas to improve sensitivity by considering alternative vital sign and lab value ranges, testing the algorithm on a larger data set, comparing algorithm performance against multiple physician reviewers, and incorporating the tool into the quality monitoring workflow at our institution.