Background: In 2016, the Sepsis-3 taskforce posited that acute organ failure is the defining feature of sepsis. Accordingly, they recommended that an acute rise in the Sequential Organ Failure Assessment (SOFA) score by 2 points over baseline should replace the Systemic Inflammatory Response Syndrome (SIRS) score as the criterion for sepsis (1). As justification, they reported that SOFA was a better predictor of outcomes than SIRS in a study of 1.3 million encounters (2). In such large cohorts, however, it is impossible to discriminate between acute and chronic organ failure. Their study therefore labelled as septic any patient with SOFA > 2, whether the score reflected an acute rise or only a chronic baseline. The assumption was that eliminating the distinction between acute and chronic organ failure would not adversely affect the validity of the results. This assumption is necessitated by a “big data” approach and is common to most studies of Sepsis-3, yet it has never been tested. We sought to study the impact of this assumption and to compare an acute rise in SOFA with past sepsis criteria (SIRS) (3,4) and with the National Early Warning Score (NEWS), which has been reported to work well in medical acute-care sepsis (5).

Methods: We identified all inpatients treated for an infection on the medical acute-care wards of a 600-bed academic medical center over one year (n = 1864). Using rigorous chart reviews and queries to our data warehouse, we determined the pre-infection SOFA baseline, the acute rise in SOFA, and the SIRS, qSOFA and NEWS scores at the onset of infection (up to 2 days before and 1 day after the initial antibiotic dose). Inter-rater reliability of chart reviews was measured with Krippendorff’s alpha. Figure 1 details this process. We compared the predictive validity of these criteria for the outcomes (primary: mortality, 1.8%; secondary: ICU transfer or mortality, 8.3%) using measures of discrimination such as the fold change in outcome rates and the area under the receiver operating characteristic curve (AUROC).
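The two discrimination measures named above can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the study: fold change is read here as the outcome rate in criterion-positive patients divided by the rate in criterion-negative patients, AUROC is computed for a raw score rather than for a risk model with baseline covariates, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def fold_change(flagged: Sequence[bool], outcome: Sequence[bool]) -> float:
    """Outcome rate in criterion-positive patients divided by the rate in
    criterion-negative patients (assumed reading of 'fold change')."""
    pos = [o for f, o in zip(flagged, outcome) if f]
    neg = [o for f, o in zip(flagged, outcome) if not f]
    if not pos or not neg:
        raise ValueError("both criterion groups must be non-empty")
    rate_neg = sum(neg) / len(neg)
    if rate_neg == 0:
        raise ValueError("no outcomes in the criterion-negative group")
    return (sum(pos) / len(pos)) / rate_neg


def auroc(scores: Sequence[float], outcome: Sequence[bool]) -> float:
    """Probability that a randomly chosen patient with the outcome has a
    higher score than one without it; ties count as one half."""
    cases = [s for s, o in zip(scores, outcome) if o]
    controls = [s for s, o in zip(scores, outcome) if not o]
    if not cases or not controls:
        raise ValueError("need patients with and without the outcome")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy cohort, not study data.
toy_scores = [0, 1, 1, 2, 3, 4, 5, 6]
toy_outcome = [False, False, True, False, True, False, True, True]
toy_flagged = [s >= 2 for s in toy_scores]  # illustrative cut-off

toy_fold = fold_change(toy_flagged, toy_outcome)
toy_auc = auroc(toy_scores, toy_outcome)
```

The pairwise AUROC is quadratic in cohort size, which is acceptable for a cohort of this scale; confidence intervals and model-based comparisons are outside the scope of this sketch.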

Results: Acute organ failure (SOFA rise by 2 over baseline) showed poor predictive validity for mortality. The fold change in outcome was not significantly higher than 1 (1.76; 95% CI: 0.89-3.96). When it was added to a baseline risk model (age, gender, race and Charlson comorbidity index), the rise in AUROC was minimal (0.67 for SOFA vs. 0.66 for the baseline model; p = 0.74). When all SOFA points (acute or chronic) were analyzed, however, performance improved (fold change in outcome: 5.29; 95% CI: 1.81-18.76; AUROC 0.70 for SOFA vs. 0.66 for the baseline model; p = 0.04). The NEWS score (sepsis defined as NEWS ≥ 6) performed better than the other criteria (fold change in outcome: 5.23; 95% CI: 2.12-23.9; AUROC 0.74). Figure 2 shows our results. Results were similar for the secondary composite outcome of ICU transfer or mortality.
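The contrast between the acute-rise and all-points SOFA analyses comes down to how the baseline is handled, which the following Python sketch tries to make concrete. The `Encounter` record, its field names, and the function names are hypothetical, and the cut-off of 2 for the all-points flag is an assumption for illustration; only the NEWS ≥ 6 threshold and the rise of 2 over baseline come from the text.

```python
from dataclasses import dataclass


@dataclass
class Encounter:
    """Hypothetical per-patient record; field names are illustrative."""
    baseline_sofa: int  # pre-infection SOFA (chronic organ dysfunction)
    onset_sofa: int     # SOFA at the onset of infection
    news: int           # NEWS at the onset of infection


def meets_acute_sofa(e: Encounter, rise: int = 2) -> bool:
    """Acute organ failure: onset SOFA is at least `rise` above baseline."""
    return e.onset_sofa - e.baseline_sofa >= rise


def meets_total_sofa(e: Encounter, cutoff: int = 2) -> bool:
    """All SOFA points counted, acute or chronic (baseline ignored).
    The cut-off value is an assumption made for this sketch."""
    return e.onset_sofa >= cutoff


def meets_news(e: Encounter, cutoff: int = 6) -> bool:
    """NEWS criterion as stated in the text: NEWS >= 6."""
    return e.news >= cutoff


# A patient with chronic organ dysfunction and no acute change is
# flagged by the all-points rule but not by the acute-rise rule.
chronic = Encounter(baseline_sofa=3, onset_sofa=3, news=2)
acute = Encounter(baseline_sofa=0, onset_sofa=2, news=7)
```

Patients like `chronic` are the ones whose classification changes when the baseline is ignored, so they are where the two analyses can diverge.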

Conclusions: In our study, failure to distinguish between acute and chronic organ failure resulted in biased overestimates of the prognostic value of Sepsis-3. If the predictive validity of the SOFA score relies heavily on chronic organ failure, then its use as the operational definition of an acute syndrome like sepsis may not be as justifiable as previously assumed. In addition, the NEWS score, which does not rely solely on organ failure, predicted outcomes better than acute organ failure in our medical acute-care cohort. This raises fundamental questions about the premise that acute organ failure is the defining feature of sepsis. These findings suggest that more rigorous scrutiny of the Sepsis-3 taskforce recommendations is warranted.

Figure 1: Data collection and cohort characteristics.

Figure 2: Comparing predictive validity.