Society: ASGE
INTRODUCTION: Colonoscopy reduces colorectal cancer mortality via the identification and removal of neoplastic polyps. In clinical trials, computer-aided detection (CADe) improves polyp detection, but there are limited data on CADe implementation in routine practice. We aimed to assess the impact of CADe on polyp detection in a large cohort of high-volume colonoscopists.
METHODS: A CADe system (GI Genius, Medtronic) was implemented in staggered fashion in a single large academic medical center to pragmatically assess its impact over a 6-month period (March 2022 to August 2022). Four CADe units were placed in a twelve-room endoscopy unit where colonoscopists rotate through different rooms. Thus, a colonoscopist may be able to use CADe when performing colonoscopy on one day (“CADe room”) but perform colonoscopy in a room without CADe the next day (“non-CADe room”). Colonoscopists who performed at least 100 colonoscopies over the 6-month period were included in this analysis. Colonoscopists were encouraged but not mandated to use CADe. The primary outcome was the polypectomy rate for screening and surveillance colonoscopy. Secondary outcomes were screening colonoscopy adenoma detection rate (ADR) and serrated detection rate (SDR). Results were further stratified by self-reported utilization of CADe: CADe majority users (self-reported use in > 50% of cases) and CADe minority users (self-reported use in < 50% of cases).
RESULTS: Over the 6-month study period, 21 colonoscopists performed 4,820 colonoscopies (Screening: 2,459, Surveillance: 1,472, and Diagnostic: 889). Of the 21 colonoscopists, 9 were CADe majority users. Screening and surveillance polypectomy rates were significantly higher in CADe rooms than in non-CADe rooms (60.5% versus 51.7%, p<0.0001; Table). When stratified by CADe use, CADe majority users had a significantly higher polypectomy rate in CADe rooms than in non-CADe rooms (66.5% versus 53.4%, p<0.0001); in contrast, CADe minority users did not have a significantly higher polypectomy rate in CADe rooms than in non-CADe rooms (54.3% versus 50.4%, p=0.2).
When CADe was available, screening colonoscopy ADR (50.6% versus 41.6%, p<0.0002) and SDR (19.4% versus 14.7%, p=0.006) significantly increased. However, as expected, this significant increase in ADR and SDR was seen only in CADe majority users, not in minority users.
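The abstract does not state which statistical test produced the p-values above. As one plausible illustration only, a comparison of two rates (for example, polypectomy rate in CADe rooms versus non-CADe rooms) could be sketched with a pooled two-proportion z-test. The function name and the counts in the usage note are hypothetical and are not study data.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions.

    Assumption: this is an illustrative test choice; the abstract does not
    say which test the authors used.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a, p_b, z, p_value


# Hypothetical counts for illustration only (not taken from the study).
rate_a, rate_b, z, p_value = two_proportion_z_test(60, 100, 50, 100)
```

With per-room procedure counts in hand, the same function could be applied separately to CADe majority and minority users to reproduce the stratified comparison.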
DISCUSSION: In this pragmatic assessment of the impact of CADe on colonoscopy quality, CADe significantly increased polypectomy rates for both screening and surveillance colonoscopy as well as screening colonoscopy ADR and SDR. Because fewer than half of colonoscopists (9 of 21) used CADe in a majority of cases, the overall impact of CADe was somewhat blunted; further work is needed to improve CADe utilization in practice.
ACKNOWLEDGEMENTS: Nives and Joseph Rizza and the Digestive Health Foundation for their generous gifts to support AI research.

Impact of CADe upon polypectomy rate, adenoma detection rate (ADR), and serrated detection rate (SDR). Notably, the impact is seen only in colonoscopists who self-report using CADe in a majority of their cases.
Aims. Since sessile serrated lesions (SSL) were introduced, multiple AI studies have tried to classify them alongside hyperplastic and adenomatous polyps, with varying results. Major hurdles for the classification of SSL are their low prevalence and difficult endoscopic recognition. Furthermore, histology-based classification can itself be subjective, as demonstrated by the inter-rater variability amongst pathologists. This study aims to improve on the baseline training method for classification by training multiple models on subtasks and then taking an ensemble vote for the final classification.
Methods. 784 unique polyps (24% hyperplastic, 69% adenomas and 7% SSL) were recorded in different endoscopic imaging modalities such as white light, blue light imaging and linked-color imaging. The ground truth was based on the histology of the polyp, assessed as hyperplastic (hyp), adenoma (adn) or SSL. The videos, containing 125 frames on average, were split into training, validation and test sets with no patient appearing in more than one set, to avoid data contamination. Subtask models following 1-vs-all and 1-vs-1 strategies were trained for each of the class combinations. The final class was determined by a vote over the subtask outputs, using different combinations of the subtasks. The results were compared to a model that directly classified the three classes (hyp-vs-adn-vs-ssl).
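Two steps of the Methods can be sketched in code: a split that keeps every patient within a single set, and a vote over 1-vs-1 and 1-vs-all subtask outputs. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract reports neither the split fractions nor the voting rule, so the 70/15/15 split, the soft-voting scheme, and all function names below are hypothetical.

```python
import random

CLASSES = ("hyp", "adn", "ssl")


def split_by_patient(records, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split (patient_id, video_id) records so no patient spans two sets.

    Assumption: the 70/15/15 fractions are placeholders; the abstract does
    not report the actual split sizes.
    """
    patients = sorted({patient_id for patient_id, _ in records})
    random.Random(seed).shuffle(patients)
    n_train = int(len(patients) * fractions[0])
    n_val = int(len(patients) * fractions[1])
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    return {
        name: [record for record in records if record[0] in ids]
        for name, ids in groups.items()
    }


def ensemble_vote(one_vs_one, one_vs_all, classes=CLASSES):
    """Combine binary subtask outputs into a single class by soft voting.

    one_vs_one: {(a, b): probability that the frame is class a rather than b}
    one_vs_all: {a: probability that the frame is class a rather than any other}
    Assumption: soft voting is one possible rule; the abstract does not
    specify how votes were weighted.
    """
    scores = {c: 0.0 for c in classes}
    for (a, b), prob_a in one_vs_one.items():
        scores[a] += prob_a
        scores[b] += 1.0 - prob_a
    for a, prob_a in one_vs_all.items():
        scores[a] += prob_a
        others = [c for c in classes if c != a]
        # Share the remaining probability mass equally among the other classes.
        for c in others:
            scores[c] += (1.0 - prob_a) / len(others)
    winner = max(classes, key=lambda c: scores[c])
    return winner, scores


# Hypothetical subtask outputs for a single frame.
label, scores = ensemble_vote(
    one_vs_one={("hyp", "adn"): 0.4, ("hyp", "ssl"): 0.3, ("adn", "ssl"): 0.45},
    one_vs_all={"ssl": 0.7},
)
```

Passing different subsets of subtasks to `ensemble_vote` corresponds to the "different combinations of the subtasks" compared in the study.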
Results. The averaged frame-based accuracy, sensitivity and positive predictive value (PPV) per class are shown in the tables. Table 1 shows the mean and standard deviation over the three classes for a selection of ensemble models. Table 2 shows the results per class for all ensemble models. The ensemble models improve on the baseline (Table 1). There is a large improvement in both sensitivity and PPV for SSL (see Table 2). There are slight variations in the metrics per class depending on the choice of ensemble.
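The per-class metrics named above can be computed from frame-level labels as in the following sketch. It treats each class one-vs-rest, which is an assumption about how "accuracy per class" was defined; the function name and the example labels are hypothetical and are not study data.

```python
def per_class_metrics(y_true, y_pred, classes=("hyp", "adn", "ssl")):
    """Frame-based accuracy, sensitivity and PPV for each class (one-vs-rest).

    Assumption: per-class accuracy is taken as (TP + TN) / N; the abstract
    does not define it explicitly.
    """
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = len(pairs) - tp - fn - fp
        metrics[c] = {
            "accuracy": (tp + tn) / len(pairs),
            # Sensitivity: share of true class-c frames that were detected.
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            # PPV: share of class-c predictions that were correct.
            "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        }
    return metrics


# Hypothetical frame labels for illustration only.
y_true = ["hyp", "adn", "adn", "ssl", "ssl", "adn"]
y_pred = ["hyp", "adn", "ssl", "ssl", "adn", "adn"]
metrics = per_class_metrics(y_true, y_pred)
```

Because SSL is the rarest class, its sensitivity and PPV are the metrics most affected by a handful of misclassified frames, which is why they are reported per class rather than only as an average.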
Conclusions. The proposed method for ensemble voting is a valid approach for improving results for characterizing sessile serrated lesions with AI. The ensemble models have similar metrics, with slight variations depending on the choice of subtasks.

Table 1 Selection of the best ensemble models per grouping type compared to the baseline (hyp-vs-adn-vs-ssl)
Table 2 Overview of the results per ensemble model for each class. Both sensitivity and PPV for SSL increase when ensemble models are used.