Background
Recurrent gastrointestinal bleeding (GIB) occurs in up to 20% of patients hospitalized with GIB and is a major cause of morbidity and mortality. Early identification of recurrent bleeding may improve patient management and outcomes. However, criteria for defining recurrent bleeding are complex and require close monitoring for a combination of symptoms (hematemesis, melena, hematochezia), vital signs, and/or hemoglobin levels. We propose an electronic health record (EHR)-based machine learning model for automated identification of recurrent bleeding after endoscopy in patients hospitalized with GIB.
Methods
We developed an automated EHR-based algorithm for recurrent bleeding based on six criteria proposed by international consensus guidelines (Figure 1). The algorithm was evaluated on a cohort of 1,114 patients who presented with acute GIB and underwent endoscopy from 2014 to 2023 at an academic medical center. Gold standard labels for recurrent bleeding were derived via manual EHR review. Heart rate, blood pressure, and hemoglobin were extracted from structured EHR tables. Overt signs of GIB were extracted from nursing notes using a hybrid natural language processing algorithm that combined traditional methods with a large language model (Llama 2). Automated decision rule algorithms for each of the six criteria were evaluated individually and then combined in an ensemble that checks whether any of the criteria are satisfied. Criterion 5 performed poorly individually and was excluded from the ensemble. We also evaluated a logistic regression model using the six criteria as inputs. Performance was reported as positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and F1 score (harmonic mean of PPV and sensitivity, used as a metric of machine learning model performance; scale of 0-1 with 0 = worst and 1 = best).
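The "any criterion satisfied" ensemble and the reported evaluation metrics can be illustrated with a minimal sketch. It assumes each criterion's decision rule has already produced a boolean per patient; the names `ensemble_flag`, `safe_ratio`, `evaluate`, and the toy patients are hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Assumption: criteria are numbered 1-6 and each decision rule yields a boolean.
EXCLUDED_CRITERIA = {5}  # Criterion 5 was excluded from the ensemble


def ensemble_flag(criteria: Dict[int, bool]) -> bool:
    """Flag recurrent bleeding if any included criterion is satisfied."""
    return any(
        met for number, met in criteria.items()
        if number not in EXCLUDED_CRITERIA
    )


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(labels: List[bool], predictions: List[bool]) -> Dict[str, float]:
    """Compute PPV, NPV, sensitivity, and specificity against reference labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fn = sum(1 for y, p in pairs if y and not p)
    return {
        "ppv": safe_ratio(tp, tp + fp),
        "npv": safe_ratio(tn, tn + fn),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
    }


# Toy, made-up patients for illustration only (not study data).
toy_criteria = [
    {1: False, 2: True, 3: False, 4: False, 5: False, 6: False},
    {1: False, 2: False, 3: False, 4: False, 5: True, 6: False},
    {1: False, 2: False, 3: False, 4: False, 5: False, 6: False},
    {1: True, 2: False, 3: False, 4: False, 5: False, 6: False},
]
toy_labels = [True, False, False, False]  # hypothetical manual-review labels
toy_predictions = [ensemble_flag(c) for c in toy_criteria]
toy_metrics = evaluate(toy_labels, toy_predictions)
```

In this toy example the second patient meets only Criterion 5 and is therefore not flagged, which shows how excluding a poorly performing criterion removes its false positives from the ensemble.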
Results
The ensemble decision rule had a PPV of 0.808, NPV of 0.993, sensitivity of 0.953, specificity of 0.969, and F1 score of 0.875 (Figure 2). The date of recurrent bleeding identified by the ensemble exactly matched the date from manual review in 89.0% of cases. Individually, Criteria 2 and 4 had the highest PPV (Criterion 2 = 1.000; Criterion 4 = 0.927). When a logistic regression was applied to the outputs of the individual decision rule criteria, overall performance improved, with a PPV of 0.847, NPV of 0.991, sensitivity of 0.953, specificity of 0.969, and F1 score of 0.897.
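As a consistency check, the reported F1 scores can be recomputed from the reported PPV and sensitivity, since F1 is the harmonic mean of the two. The sketch below uses only the values stated above; the function name `f1_from_ppv_sensitivity` is hypothetical.

```python
def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Values as reported for the ensemble and the logistic regression.
f1_ensemble = round(f1_from_ppv_sensitivity(0.808, 0.953), 3)
f1_logistic = round(f1_from_ppv_sensitivity(0.847, 0.953), 3)

print(f1_ensemble)  # 0.875
print(f1_logistic)  # 0.897
```

Both recomputed values agree with the reported F1 scores to three decimal places.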
Conclusion
An automated EHR-based machine learning algorithm can robustly and efficiently identify recurrent bleeding in GIB patients after endoscopy. This model allows ongoing, real-time monitoring for recurrent bleeding, providing the opportunity for more timely identification and intervention in these high-risk patients with less utilization of personnel and resources than in current clinical practice.

Figure 1. Criteria for recurrent gastrointestinal bleeding adapted from international consensus guidelines (Laine et al. 2010).
Figure 2. Performance of automated decision rule and machine learning algorithms for recurrent gastrointestinal bleeding.