1017

FEASIBILITY OF AUTOMATED MACHINE LEARNING USING PUBLIC DATA FOR USE IN ENDOSCOPY

Date
May 9, 2023

Society: AGA

Background & Aim:
For training deep learning algorithms in medical imaging, application-specific data are often scarce. Deep learning systems are therefore generally pretrained on large, publicly available, labeled data sets of general imagery unrelated to the envisioned application, so that the algorithm learns basic features from these widely available data, followed by refinement training on the scarce application-specific images. Pretraining might be more effective if the pretraining images resemble the envisioned application, i.e., domain-specific pretraining. We investigated whether pretraining on general endoscopic imagery results in better performance of five existing AI systems with an application in gastrointestinal (GI) endoscopy, compared to current state-of-the-art pretraining approaches (i.e., supervised pretraining with ImageNet and semi-weakly supervised pretraining with the Billion-scale data set).

Methods:
Our group has created an endoscopy-specific dataset called GastroNet for pretraining deep learning systems in endoscopy. GastroNet consists of 5,084,494 endoscopic images retrospectively collected between 2012 and 2020 in seven Dutch hospitals. We created four pretrained models: one using GastroNet and three using ImageNet and/or the Billion-scale data set. The pretraining method was either supervised, self-supervised, or semi-weakly supervised. The pretrained models were subsequently trained towards five independent, commonly used applications in GI endoscopy, using their original application-specific datasets. Outcome parameters were: 1) classification and/or localization performance of the five trained applications; 2) change in performance when the amount of available application-specific training data was reduced, to investigate a possible difference in performance drop between the pretrained models. The different combinations of pretraining data and method, test sets, and downstream tasks are visualized in Figure 1.
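The second outcome parameter (performance under reduced application-specific training data) implies repeatedly drawing smaller training subsets. The abstract does not state how the subsets were drawn or which fractions were used, so the following is only a minimal sketch under assumptions: the function name `stratified_subsample`, the per-class (stratified) sampling, and the example fractions are all hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices of a subset that keeps roughly `fraction`
    of the samples in every class (hypothetical helper, not from the study)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    kept = []
    for indices in by_class.values():
        # Keep at least one sample per class so no class disappears.
        k = max(1, round(len(indices) * fraction))
        kept.extend(rng.sample(indices, k))
    return sorted(kept)


if __name__ == "__main__":
    # Toy label list: 80 negatives and 20 positives (illustrative only).
    toy_labels = [0] * 80 + [1] * 20
    for frac in (1.0, 0.5, 0.25):  # assumed fractions, not from the abstract
        subset = stratified_subsample(toy_labels, frac)
        print(frac, len(subset))
```

Each pretrained model would then be fine-tuned on the same subsets, so that any difference in performance drop can be attributed to the pretraining rather than to the sampled data.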

Results:
Overall, the domain-specific pretrained model resulted in statistically significantly better performance for the five different GI applications. More detailed results are presented in Table 1. The superiority was also reflected in a smaller drop in performance when the amount of application-specific training data was artificially reduced.

Conclusion:
Domain-specific pretraining, using unlabeled general endoscopic images, is superior to current state-of-the-art pretraining approaches for developing deep learning algorithms in GI endoscopy. It also allows more effective use of the generally scarce application-specific endoscopy images. These findings might cause a paradigm shift in the development of AI systems in endoscopy.

Figure 1. Flow diagram of different pretraining methods, data sets and downstream tasks.

Table 1. Overview of performance of the five different application-specific data sets using four different pretrained models. Cells highlighted in green represent the highest scoring pretrained model per application-specific data set.

Introduction
With recent successful applications of computer vision in gastroenterology and endoscopy, there has been strong interest among physicians in developing practical skills in artificial intelligence. Automated Machine Learning (AutoML) platforms may increase access to complex deep learning algorithms that would otherwise be inaccessible, and may allow physicians to build complex models for a variety of use-cases simply by providing labeled data.

We focused on three commonly used AutoML platforms created by Microsoft, Amazon, and Google that market their ability to create image classification and object detection models. Using labeled data from the publicly available SUN[1] colonoscopy data set, we developed computer-aided diagnosis (CADx) and computer-aided detection (CADe) models on all three AutoML platforms.

Methods
The dataset used to evaluate model performance is the SUN (Showa University and Nagoya University) Colonoscopy Video Database. To create the models, the data were uploaded to the respective platforms and the annotation files were parsed into a format readable by the platform. The dataset was split 70/10/20 for training, validation, and testing. We used metrics including sensitivity, specificity, PPV, NPV, F1, AuROC, accuracy, precision, and recall to evaluate the CADx models. CADe models were evaluated using precision, recall, and F1 score. We used analysis of variance (ANOVA) testing with an alpha of 0.05 to determine if the performance of each CADx model was different across platforms.
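The split and the CADx metrics described above can be sketched in plain Python. This is an illustration under assumptions, not the authors' pipeline: the function names `split_70_10_20` and `binary_metrics` are hypothetical, the split is a simple random shuffle (the abstract does not say whether frames were split per frame, per video, or per patient), and labels are assumed to be encoded as 1 for the positive class and 0 for the negative class.

```python
import random


def split_70_10_20(items, seed=0):
    """Shuffle items and split them 70/10/20 into train, validation, test."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_val = n * 10 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, roughly 20%
    return train, val, test


def _ratio(numerator, denominator):
    """Safe division: undefined ratios are reported as NaN."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # identical to recall
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),          # identical to precision
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    train, val, test = split_70_10_20(range(1000))
    print(len(train), len(val), len(test))
    # Toy predictions, for illustration only.
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

Note that sensitivity and recall are the same quantity, as are PPV and precision, so the nine listed metrics contain seven distinct values. For frames extracted from videos, splitting at the video or patient level is the usual way to prevent near-identical frames from appearing in both the training and test sets.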

Results
The sensitivity of the three CADx models was 0.9996, 0.9801, and 0.9770 for Microsoft, Google, and Amazon respectively. The specificity, in the same order, was 0.9993, 0.9665, and 0.9633. The F1 scores of the CADx models built using the Microsoft, Google, and Amazon platforms were 0.9996, 0.9800, and 0.9768 respectively, a statistically significant difference (P=0.0044). The F1 scores for the CADe models made by the Microsoft, Google, and Amazon platforms (using an IoU threshold of 0.5) were 0.9929, 0.9650, and 0.8980 respectively.
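For the CADe results, a predicted bounding box counts as correct only if its intersection over union (IoU) with a ground-truth box reaches the threshold of 0.5. The sketch below shows one common way to compute this for a single image. It is an assumption-laden illustration: the function names, the `(x_min, y_min, x_max, y_max)` box format, and the greedy one-to-one matching are hypothetical choices, and each AutoML platform may match and score detections differently.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def detection_scores(gt_boxes, pred_boxes, threshold=0.5):
    """Greedily match predictions to ground truth; return precision, recall, F1."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for pred in pred_boxes:
        best_index, best_iou = None, threshold
        for index, gt in enumerate(gt_boxes):
            if index in matched:
                continue
            overlap = iou(pred, gt)
            if overlap >= best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None:
            matched.add(best_index)
            tp += 1
    fp = len(pred_boxes) - tp  # predictions without a matching ground truth
    fn = len(gt_boxes) - tp    # ground-truth boxes that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # One polyp annotation and two predictions (toy coordinates).
    ground_truth = [(0, 0, 10, 10)]
    predictions = [(0, 0, 10, 10), (20, 20, 30, 30)]
    print(detection_scores(ground_truth, predictions))
```

In practice the counts would be accumulated over all test images before computing precision, recall, and F1, and predictions are often sorted by confidence before matching.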

Conclusions
Using minimal coding, we were able to create three algorithms, all of which achieved high F1 scores (> 0.9) on the CADe and CADx use-cases. There was a statistically significant difference in the F1 scores of the models created by the AutoML platforms. Further analysis on larger datasets and on different landmarks is needed to determine whether the Microsoft AutoML platform consistently performs best on all endoscopic computer vision tasks. AutoML platforms represent a practical entry point for endoscopists interested in exploring computer vision for GI endoscopy and may be an important catalyst for physician-driven innovation.

[1] SUN Colonoscopy Video Database. Hayato Itoh, Masashi Misawa, Yuichi Mori, Masahiro Oda, Shin-Ei Kudo, Kensaku Mori, 2020, http://amed8k.sundatabase.org/

Table 1: CADx Model Performance on Testing Dataset

