BACKGROUND: Patients with inflammatory bowel disease (IBD) are at increased risk of malnutrition through several mechanisms: reduced and selective oral intake, nutrient malabsorption, and inflammation-mediated catabolism. Patients hospitalized for IBD with protein-calorie malnutrition (PCM) face an especially high risk of morbidity and mortality. Accurate risk stratification is therefore critical for identifying the patients most in need of closer monitoring and assertive intervention to reduce the risk of mortality. This study aimed to develop a machine learning model that accurately predicts mortality in hospitalized IBD patients with PCM.
METHODS: Hospitalized adults with IBD and PCM were identified in the 2016-2019 National Inpatient Sample (NIS) using International Classification of Diseases, 10th Revision (ICD-10) codes. Features of interest included patient-, admission-, and hospital-related factors. Random decision forests were used to select additional important comorbidities and Clinical Classifications Software Refined (CCSR)-defined diagnosis and procedure categories in a randomly sampled training set comprising 70% of the 2016-2018 data. Random Forest (RF) and Extreme Gradient Boosting (XGB) models were then fitted to unweighted records in the 2016-2018 training set, tested on the remaining 30% of the 2016-2018 data, and externally validated on the entire 2019 data. Patient characteristics were evaluated using weighted estimates that accounted for the complex sampling design of the NIS.
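The workflow described above (random-forest feature selection, a 70/30 split of the development years, and external validation on a later year) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic, the cut-off of 10 features is arbitrary, scikit-learn's GradientBoostingClassifier is swapped in for XGB, and NIS survey weights are not modeled.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data with a rare positive class (assumption: stands in for NIS records).
X, y = make_classification(n_samples=6000, n_features=20, n_informative=6,
                           weights=[0.95], random_state=0)
# Hypothetical mapping: first 4000 rows play "2016-2018", last 2000 play "2019".
X_dev, y_dev = X[:4000], y[:4000]
X_ext, y_ext = X[4000:], y[4000:]

# 70% training / 30% internal test split of the development years.
X_train, X_test, y_train, y_test = train_test_split(
    X_dev, y_dev, test_size=0.30, stratify=y_dev, random_state=0)

# Feature selection: rank features by random-forest importance on the training set
# and keep the top 10 (the cut-off is an arbitrary choice for this sketch).
selector = RandomForestClassifier(n_estimators=100, random_state=0)
selector.fit(X_train, y_train)
top = np.argsort(selector.feature_importances_)[::-1][:10]

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    # Stand-in for XGB: scikit-learn's gradient boosting, not the xgboost library.
    "GB": GradientBoostingClassifier(random_state=0),
}
splits = {"test": (X_test, y_test), "external": (X_ext, y_ext)}

results = {}
for name, model in models.items():
    model.fit(X_train[:, top], y_train)
    for split, (X_s, y_s) in splits.items():
        prob = model.predict_proba(X_s[:, top])[:, 1]
        results[(name, split)] = {
            "accuracy": accuracy_score(y_s, model.predict(X_s[:, top])),
            "auroc": roc_auc_score(y_s, prob),
        }

for (name, split), m in results.items():
    print(f"{name} {split}: accuracy={m['accuracy']:.3f} AUROC={m['auroc']:.3f}")
```

Selecting features on the training set only, as in this sketch, keeps the internal test and external validation sets untouched until evaluation.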
RESULTS: Among 879,730 malnourished patients hospitalized for IBD, 1,930 (0.2%) died. Compared with malnourished patients who survived, those who died were more often aged ≥60 years (80.2% vs. 29.8%, P<0.001) and White (83.7% vs. 75.9%, P<0.001), more often had ulcerative colitis (51.6% vs. 36.8%, P<0.001) and multiple comorbidities (16.7% vs. 2.2%, P<0.001), and were more often admitted on the weekend (27.2% vs. 20.1%, P<0.001). In the external validation set, the RF model achieved an accuracy of 0.99 and an area under the receiver operating characteristic curve (AUROC) of 0.95 (Figure), and the XGB model achieved an accuracy of 0.99 and an AUROC of 0.93. Despite the imbalanced event rate among patients, precision and recall for both machine learning models were 0.98 and 0.99, respectively.
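For readers less familiar with these metrics, the following minimal sketch shows how accuracy, precision, and recall are derived from a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall for the positive class."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Precision: share of predicted positives that are true positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of actual positives that were predicted positive.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical counts: 1,000 admissions, 2 of which end in death (0.2% event rate).
metrics = classification_metrics(tp=1, fp=3, fn=1, tn=995)
print(metrics)
```

With a rare outcome, accuracy is driven mostly by the majority class, which is why precision and recall are reported alongside it.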
CONCLUSION: The machine learning models predicted mortality with excellent performance while relying solely on readily available clinical data. Further research on integrating these semi-automated tools into clinical practice could improve risk stratification of IBD patients with PCM and potentially reduce mortality in this high-risk population.

Figure. Area under the receiver operating characteristic curve (AUROC) for the Random Forest (Model 1) and Extreme Gradient Boosting (Model 2) models in the external validation set.