Background: Frailty is linked to poor outcomes in older patients, especially those with multiple conditions. The electronic frailty index (eFI) is a validated tool based on the cumulative deficit model used to screen hospitalized patients at risk for poor outcomes.1 The eFI relies on 35 ICD-10 codes associated with encounters across four domains (morbidity, sensory, cognitive, functional).2 However, some diagnoses, such as gait abnormality in the functional domain, are often missing,3 potentially underestimating eFI scores. Physical therapist (PT) and case management (CM) notes typically include functional status assessments in unstructured text, which could identify these missing diagnosis codes. This study aims to determine whether prompting Large Language Models (LLMs) like ChatGPT to identify functional status diagnosis codes from PT and CM assessments improves eFI calculation in older adults with multiple conditions likely to be frail.
Methods: We conducted a retrospective analysis of hospitalized patients (>51 years) from two studies (Table 1, footnote). We collected demographic information, ICD-10 diagnosis codes from ambulatory encounters in the 24 months before the hospital encounter, PT and CM notes from the hospital encounter, and 90-day post-discharge data (readmission, disposition, mortality) from the EHR (Epic Systems, Inc.). We prompted GPT-3.5 and GPT-4 in Azure AI Studio (Microsoft, Inc.) to identify functional diagnoses from unstructured PT and CM documentation (Figure 1). We calculated eFIs using only ambulatory encounter diagnoses, and then recalculated them with the GPT-identified functional diagnoses added. Descriptive statistics were used to report demographic characteristics and post-discharge events for the overall cohort and frail cases (eFI > 0.2). We compared the frailty rates identified by the unenhanced eFI and the GPT-enhanced eFIs using bivariate statistics, and analyzed post-discharge events (rehospitalization, institutionalization, death) within 90 days for frail and non-frail cases using multivariable analyses.
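The cumulative-deficit eFI calculation described above can be sketched as follows. This is an illustrative sketch only: the ICD-10 prefix mapping and deficit names are assumptions for demonstration, not the study's actual 35-code list, and the real eFI implementation may differ.

```python
# Hedged sketch of a cumulative-deficit eFI, per the abstract's description:
# eFI = (deficits present) / (35 deficits considered); frail if eFI > 0.2.
# The deficit-to-code mapping below is hypothetical, not the study's list.

TOTAL_DEFICITS = 35        # number of deficits in the eFI per the abstract
FRAILTY_THRESHOLD = 0.2    # eFI > 0.2 flags a case as frail

# Illustrative ICD-10 prefix mapping (assumed examples from each domain)
DEFICIT_CODE_PREFIXES = {
    "gait_abnormality": ("R26",),    # functional domain
    "hearing_loss": ("H90", "H91"),  # sensory domain
    "dementia": ("F03",),            # cognitive domain
    "heart_failure": ("I50",),       # morbidity domain
    # ... remaining deficits omitted in this sketch
}

def efi_score(icd10_codes: set[str]) -> float:
    """Fraction of deficits present among all deficits considered."""
    present = sum(
        any(code.startswith(p) for code in icd10_codes for p in prefixes)
        for prefixes in DEFICIT_CODE_PREFIXES.values()
    )
    return present / TOTAL_DEFICITS

# Unenhanced eFI uses ambulatory encounter codes only; the enhanced eFI
# adds functional codes the LLM extracted from PT/CM notes.
ambulatory = {"I50.9", "F03.90"}
llm_extracted = {"R26.2"}  # e.g., a gait abnormality found in a PT note

unenhanced = efi_score(ambulatory)
enhanced = efi_score(ambulatory | llm_extracted)
```

In this toy example the enhanced score rises because the LLM-extracted gait code counts as an additional deficit, mirroring how missing functional diagnoses would otherwise underestimate the eFI.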
Results: The characteristics of the 616 included cases (Table 1a) and frail cases (unenhanced, n=91; GPT-3.5, n=323; GPT-4, n=247) were similar. In our cohort, 320 cases (51.9%) had at least one post-discharge event (Table 1b) within 90 days after hospitalization. An example of the prompt, a truncated PT note, and GPT output are shown in Figure 1. More frail cases were identified using LLM-enhanced eFIs (GPT-3.5: 52.4% vs. 14.8%, p<0.01; GPT-4: 40.1% vs. 14.8%, p<0.01) compared to the unenhanced eFI. Post-discharge events were significantly more frequent in frail cases compared to non-frail cases for all eFIs (unenhanced: 17.8% vs. 11.5%, p=0.03; GPT-3.5: 58.8% vs. 35.5%, p<0.01; GPT-4: 61.1% vs. 32.4%, p<0.01). Adjusted analyses showed frailty was associated with at least one post-discharge event for each eFI (unenhanced: OR 1.13 [1.01, 1.26], p=0.04; GPT-3.5: OR 1.16 [1.07, 1.26], p<0.01; GPT-4: OR 1.15 [1.06, 1.25], p<0.01).
Conclusions: In older hospitalized patients, LLM-enhanced eFIs identified significantly more frail cases by detecting functional deficits in unstructured PT and CM documentation. This approach may improve care planning and facilitate comprehensive frailty screening for older adults as required by CMS's Age-Friendly Hospital Measure. Future research should validate these findings in prospective studies with larger, more diverse populations.

