Background
Anti-granulocyte/macrophage-colony stimulating factor autoantibody (GM-CSF AuAB) has been identified before the onset of Crohn’s Disease (CD) and is associated with an increase in complicated disease. However, its connection to the gut microbiome, a key factor in CD progression, remains unclear.
Aim
We aimed to use the RISK Stratification Study (designed to identify predictors of complications in pediatric CD at diagnosis) to assess the relationships among GM-CSF AuAB, the gut microbiome, and disease phenotype.
Methods
We profiled the gut microbiome across multiple kingdoms (bacteria, archaea, fungi, and DNA viruses) using existing amplicon and whole-metagenome sequencing data from stool samples collected at each patient’s first visit. To identify biomarkers of stricturing (B2) and penetrating (B3) complications, we compared patients across three states (B1 [inflammatory disease], pre-B2/B3, and established-B2/B3; Fig. a) using multivariate ANOVA tests while controlling for demographic and clinical confounders. Finally, we built Random Forest models to predict the onset of CD complications.
Results
Using comprehensive microbiome profiles of 282 new-onset pediatric CD patients, we identified biomarkers for both B2 and B3 complications. Microbiome differences for B2 included Ruminococcus gnavus and ten metabolic pathways related to mucosal barrier function and cellular integrity. For B3, significant changes were observed in Proteobacteria, along with 19 pathways that may increase cellular stress and DNA damage. Moreover, we identified distinct sets of viruses (but not archaea or fungi) associated with each complication. Cross-comparison revealed little overlap between the biomarkers for B2 and B3, suggesting independent mechanisms for each complication. We then found that the prediction of CD complications was significantly enhanced when GM-CSF AuAB and other serological information were integrated with gut microbiome data (AUC = 0.83 for B2 and 0.90 for B3). DNA viruses consistently provided greater predictive value than bacteria, archaea, and fungi (Fig. b,c). In addition, three clinical factors (GM-CSF AuAB, ASCA IgG, and ASCA IgA) increased consistently from B1 to pre- and established-B2/B3. Correlation analysis further revealed that only GM-CSF AuAB was significantly associated with the alpha diversity (R=0.124, p=0.039) and composition (PERMANOVA, p=0.022) of microbial functions, as well as with the composition of DNA viruses (p=0.038).
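For readers unfamiliar with PERMANOVA, the following is a minimal sketch of a one-way permutation test on a dissimilarity matrix. It is not the analysis pipeline used in this study: the Bray-Curtis dissimilarity, the split of patients into "low" and "high" GM-CSF AuAB groups, the toy abundance profiles, and all function names are assumptions made for illustration only.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two non-negative abundance profiles."""
    return sum(abs(a - b) for a, b in zip(x, y)) / (sum(x) + sum(y))


def pseudo_f(dist, groups):
    """One-way PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all sample pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity computed inside each group.
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and profiles
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Toy (fabricated for illustration) abundance profiles for eight samples.
profiles = [
    [10, 2, 1], [9, 3, 1], [11, 2, 2], [10, 3, 2],   # hypothetical "low" group
    [2, 9, 6], [1, 10, 5], [3, 8, 7], [2, 10, 6],    # hypothetical "high" group
]
groups = ["low"] * 4 + ["high"] * 4

dist = [[bray_curtis(p, q) for q in profiles] for p in profiles]
f_stat, p_value = permanova(dist, groups)
print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

A small p-value here indicates only that the toy groups differ in composition more than expected under random relabelling. A real analysis would typically treat GM-CSF AuAB as a covariate alongside confounders, which this one-factor sketch does not do.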
Conclusion
Our findings indicate that DNA virus data contribute significantly to predicting CD complications and are associated with GM-CSF AuAB. This highlights a potentially critical link between GM-CSF AuAB and viruses in the development of CD complications, which could be leveraged to advance clinical prediction and deserves validation.

Figure caption. (a) Definition of B1, pre-, and established-B2/B3, along with the number of individuals in each state for the B2 and B3 complications, as analyzed in our microbiome data. (b) Prediction of B2 using different data types. To mitigate the effects of randomness in the modeling process, we incrementally increased the number of top contributing features and evaluated performance by multi-class AUC. The graph demonstrates that using all features (both microbiome and serological information) yields the best predictive performance. Among single data types, DNA viruses show the highest predictive accuracy. (c) Prediction of B3 using different data types. DNA viruses remain the most predictive single data type. We hypothesize that the accuracy of predicting complications from a single data type reflects the strength of its relationship with the occurrence of complications.
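As an illustration of the multi-class AUC mentioned in the caption, the sketch below computes a macro-averaged one-vs-rest AUC over the three states. The averaging scheme is an assumption, since the abstract does not state which multi-class AUC variant was used. The predicted probabilities are invented toy values, and the function names are hypothetical.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(prob_rows, labels, classes):
    """Unweighted mean of one-vs-rest AUCs, one per class."""
    aucs = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]  # predicted probability of class k
        aucs.append(binary_auc(scores, [y == cls for y in labels]))
    return sum(aucs) / len(aucs)


# Three states as in Fig. a; probabilities are toy values, not study output.
classes = ["B1", "pre-B2", "established-B2"]
labels = ["B1", "B1", "pre-B2", "pre-B2", "established-B2", "established-B2"]
probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],  # a pre-B2 patient that the toy model ranks as B1
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]

macro_auc = macro_ovr_auc(probs, labels, classes)
print(f"macro one-vs-rest AUC = {macro_auc:.3f}")
```

Repeating this calculation while growing the set of top-ranked features, as the caption describes, would trace out one curve per data type.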