AUTOMATED IDENTIFICATION OF RECURRENT GASTROINTESTINAL BLEEDING USING ELECTRONIC HEALTH RECORDS AND LARGE LANGUAGE MODELS

Date

May 21, 2024

Background
Recurrent gastrointestinal bleeding (GIB) occurs in up to 20% of patients hospitalized with GIB and is a major cause of morbidity and mortality. Early identification of recurrent bleeding may improve patient management and outcomes. However, criteria for defining recurrent bleeding are complex and require close monitoring for a combination of symptoms (hematemesis, melena, hematochezia), vital signs, and/or hemoglobin levels. We propose an electronic health record (EHR)-based machine learning model for automated identification of recurrent bleeding after endoscopy in patients hospitalized with GIB.

Methods
We developed an automated EHR-based algorithm for recurrent bleeding based on six criteria proposed by international consensus guidelines (Figure 1). The algorithm was evaluated on a cohort of 1,114 patients who presented with acute GIB and underwent endoscopy from 2014 to 2023 at an academic medical center. Gold standard labels for recurrent bleeding were derived via manual EHR review. Heart rate, blood pressure, and hemoglobin were extracted from structured EHR tables. Overt signs of GIB were extracted from nursing notes using a hybrid natural language processing algorithm that combined traditional methods with a large language model (Llama 2). Automated decision rule algorithms for each of the six criteria were evaluated individually and then combined in an ensemble that checks whether any of the criteria is satisfied. Criterion 5 performed poorly individually and was excluded from the ensemble. We also evaluated a logistic regression model using the six criteria as inputs. Performance was reported as positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and F1 score (harmonic mean of PPV and sensitivity, used as a metric of machine learning model performance; scale of 0-1 with 0 = worst and 1 = best).
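The any-criterion ensemble and the reported performance metrics can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names, the per-patient dictionary of boolean criterion results, and the toy cohort are all hypothetical. Only the "any criterion satisfied" rule and the exclusion of Criterion 5 come from the Methods.

```python
import math
from typing import Dict, Sequence

# Criterion 5 is left out of the ensemble, as described in the Methods.
EXCLUDED_CRITERIA = {5}


def ensemble_flag(criteria: Dict[int, bool]) -> bool:
    """Flag recurrent bleeding if any non-excluded criterion is satisfied."""
    return any(met for idx, met in criteria.items()
               if idx not in EXCLUDED_CRITERIA)


def safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, float]:
    """Compute PPV, NPV, sensitivity, specificity, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    ppv = safe_div(tp, tp + fp)
    sensitivity = safe_div(tp, tp + fn)
    return {
        "ppv": ppv,
        "npv": safe_div(tn, tn + fn),
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        # F1 is the harmonic mean of PPV and sensitivity.
        "f1": safe_div(2 * ppv * sensitivity, ppv + sensitivity),
    }


# Hypothetical toy cohort: (criterion results, manual-review label).
toy_cohort = [
    ({1: True, 2: False, 3: False, 4: False, 5: False, 6: False}, True),
    ({1: False, 2: False, 3: False, 4: False, 5: True, 6: False}, False),
    ({1: False, 2: False, 3: False, 4: False, 5: False, 6: False}, False),
    ({1: False, 2: False, 3: False, 4: True, 5: False, 6: False}, False),
    ({1: False, 2: False, 3: False, 4: False, 5: False, 6: False}, True),
    ({1: False, 2: True, 3: False, 4: False, 5: False, 6: True}, True),
]

labels = [label for _, label in toy_cohort]
predictions = [ensemble_flag(criteria) for criteria, _ in toy_cohort]
toy_metrics = metrics(labels, predictions)
print(toy_metrics)
```

In this sketch a patient who meets only Criterion 5 is not flagged, which mirrors the exclusion described above. The logistic regression variant would take the same per-criterion booleans as input features and is not shown here.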

Results
The ensemble decision rule had a PPV of 0.808, NPV of 0.993, sensitivity of 0.953, specificity of 0.969, and F1 score of 0.875 (Figure 2). The date of recurrent bleeding reported by the ensemble exactly matched the manual review in 89.0% of cases. Individually, Criteria 2 and 4 had the highest PPV (Criterion 2 = 1.000; Criterion 4 = 0.927). When a logistic regression was applied to the outputs of the decision rule criteria, overall performance improved, with a PPV of 0.847, NPV of 0.991, sensitivity of 0.953, specificity of 0.969, and F1 score of 0.897.
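As a quick arithmetic check, the reported F1 scores can be recomputed from the reported PPV and sensitivity, since F1 is the harmonic mean of the two. The sketch below uses only the rounded values stated above, so agreement is expected to three decimal places at best; the function and variable names are illustrative.

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) as stated in the Results.
reported = {
    "ensemble decision rule": (0.808, 0.953, 0.875),
    "logistic regression": (0.847, 0.953, 0.897),
}

for name, (ppv, sensitivity, f1_reported) in reported.items():
    f1 = f1_score(ppv, sensitivity)
    print(f"{name}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

Both recomputed values round to the reported F1 scores.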

Conclusion
An automated EHR-based machine learning algorithm can robustly and efficiently identify recurrent bleeding in GIB patients after endoscopy. This model allows ongoing, real-time monitoring for recurrent bleeding, providing the opportunity for more timely identification and intervention in these high-risk patients with less utilization of personnel and resources than in current clinical practice.

Figure 1. Criteria for recurrent gastrointestinal bleeding adapted from international consensus guidelines (Laine et al. 2010).

Figure 2. Performance of automated decision rule and machine learning algorithms for recurrent gastrointestinal bleeding.

Presenter

Neil Zheng

Speakers

Yale School of Medicine

Dennis Shung

Yale University School of Medicine

Tracks

AGA
