INTERPRETABLE MACHINE LEARNING PREDICTIVE MODEL TO UNDERSTAND RISK FACTORS FOR EARLY ONSET COLORECTAL CANCER

Date: May 20, 2024

Purpose: Colorectal cancer (CRC) stands as the third most prevalent cancer in the USA, presenting a critical health concern. This paper introduces a novel explainable network-based AI model tailored to identify high-risk CRC patients under 50, particularly those without a family history.
Methods: Machine learning often requires navigating a delicate balance between accuracy and intelligibility. Highly accurate models such as boosted trees, random forests, and neural networks tend to lack interpretability, creating challenges in critical domains like healthcare, where model understanding and trust are paramount. In this study, we adopt high-performance generalized additive models with pairwise interactions (GA2Ms) to address this challenge, implemented as the Explainable Boosting Machine (EBM): a tree-based, cyclic gradient-boosting Generalized Additive Model (GAM) with automatic interaction detection. Because of its highly intelligible, modular structure, the EBM is often called a glass-box model: the contribution of each feature can be inspected and, if spurious, eliminated. There are two main aspects of explainability in a machine-learning model: global and local explanation. A global explanation provides an overall summary of the importance of different features in prediction or classification, whereas a local explanation provides interpretability at the level of each individual instance. The EBM provides both global and local explanations and therefore overcomes the shortcomings of black-box models (fig 1).
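The additive structure described above can be made concrete with a small sketch. This is not the authors' implementation: the shape functions below are hypothetical lookup tables over binned risk factors, whereas a real EBM learns them by cyclic gradient boosting. The point is that a GA2M prediction is a sum of per-feature and pairwise-interaction terms, so each prediction decomposes exactly into the per-term contributions that constitute a local explanation.

```python
import math

# Illustrative GA2M sketch: g(E[y]) = b0 + sum_i f_i(x_i) + sum_ij f_ij(x_i, x_j).
# All values here are hypothetical, chosen only to show the decomposition.
INTERCEPT = -2.0  # baseline log-odds

F_AGE = {"<40": -0.5, "40-49": 0.3}        # shape function f_age
F_DIABETES = {False: -0.1, True: 0.6}      # shape function f_diabetes
F_AGE_X_DIABETES = {("40-49", True): 0.4}  # pairwise interaction f_age,diabetes

def predict_proba(age_bin, diabetes):
    """Sum the additive terms, then map log-odds to a probability."""
    contributions = {
        "age": F_AGE[age_bin],
        "diabetes": F_DIABETES[diabetes],
        "age x diabetes": F_AGE_X_DIABETES.get((age_bin, diabetes), 0.0),
    }
    logit = INTERCEPT + sum(contributions.values())
    prob = 1.0 / (1.0 + math.exp(-logit))
    # `contributions` is the local explanation for this one patient;
    # averaging |contributions| over a dataset yields a global ranking.
    return prob, contributions

prob, contribs = predict_proba("40-49", True)
```

Because every term is a function of one or two features, each shape function can also be plotted directly, which is what makes the model a glass box rather than a post-hoc approximation.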
Results: After data cleaning and selection steps, we had 7507 CRC patients and 37,831 non-CRC patients from the Cerner database. For the under-50 age group, our model achieved a correct prediction rate of 73.2%, with an area under the ROC curve of 0.81. At a sensitivity of 50%, the false positive rate stood at 11.5%, a superior performance compared to existing state-of-the-art models. The proposed variables, capturing the intricate interactions between multiple diseases, not only enhance colorectal cancer prediction but also offer adaptability for future predictions across various health conditions.
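The "false positive rate at 50% sensitivity" metric is read off the model's ROC curve: sweep the decision threshold and report the lowest FPR among thresholds that reach the target sensitivity. A minimal sketch with toy risk scores (the paper's data are not reproduced here; the function name is our own):

```python
def fpr_at_sensitivity(scores_pos, scores_neg, target_tpr):
    """Lowest FPR achievable at or above the target sensitivity (TPR).

    Sweeps candidate thresholds over the observed scores; a case is
    flagged positive when its score >= threshold.
    """
    best_fpr = 1.0
    for t in sorted(set(scores_pos + scores_neg), reverse=True):
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        if tpr >= target_tpr:
            best_fpr = min(best_fpr, fpr)
    return best_fpr

# Toy example: risk scores for 4 true CRC cases and 5 controls.
pos = [0.9, 0.6, 0.4, 0.3]
neg = [0.7, 0.5, 0.2, 0.1, 0.05]
fpr_at_sensitivity(pos, neg, 0.5)  # → 0.2
```

The AUC of 0.81 summarizes this same curve across all thresholds, while the 11.5% figure fixes a single clinically motivated operating point.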
Conclusion: This network-based model demonstrates exceptional promise in identifying younger individuals at heightened risk of colorectal cancer, particularly those lacking a family history. The utilization of GA2Ms successfully addresses the accuracy-intelligibility trade-off in machine learning, which is crucial in healthcare applications where interpretability is paramount.
Local and Global Intelligent models

Presenter
Sravanthi Parasa, Swedish Medical Center
