Background: Coupled with well-characterized host genetic susceptibilities and environmental risk factors, alterations of gut microbial communities contribute to risk and severity of the inflammatory bowel diseases (IBD) and their subtypes, Crohn’s disease (CD) and ulcerative colitis (UC). However, beyond descriptions of broad departures from more typical ecology, the identification of a reproducible IBD-associated microbial configuration has been elusive, in part due to the personalization of the gut microbiome, difficulties assembling diverse multinational cohorts, varied molecular and computational methods, and the statistical power required to overcome these challenges.
Methods: We applied a meta-analytical framework for cross-cohort population-scale interrogation of the gut microbiome to identify consistent metagenomic signatures of IBD. We curated and uniformly pre-processed 2,371 stool metagenomes from 542 individuals with IBD and their referent non-IBD counterparts from the United States, Canada, and Europe across all seven cohorts in The Human Microbiome Bioactives Resource (Fig. A & B).
Results: We performed feature-level (i.e., microbial taxa, functional pathways, gene families) meta-analyses using multivariable linear mixed-effects models with joint normalization of taxa abundances for batch/cohort effects, adjusting for age, sex, and antibiotic use, with random subject intercepts to account for within-subject correlations. Cohort-level differential abundance estimates were then pooled across cohorts with random-effects modeling. This identified 84 consistently differentially abundant species, including 13 novel associations that did not achieve statistical significance within any one study, firmly establishing the rationale for a meta-analytic approach to population structure discovery (PFDR<0.05; Fig. C). We systematically identified the mass expansion of oral-predominant taxa in the IBD gut, such as Veillonella and Streptococcus (Fig. D). Further, we observed disease-specific shifts in cobalamin biosynthesis pathways, a likely consequence of small bowel dysfunction, in CD but not UC (not shown). Finally, using complementary phylogenetic and gene-based modeling, we observed novel differences in the carriage of gene cassettes linked to inflammation and substrate utilization among subclades present in IBD compared to the same taxa found in non-IBD, suggesting that strain-specific functional differences may contribute to disease-related bacterial fitness even when higher-order species-level differential abundance was more modest.
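For illustration only, the cross-cohort synthesis step can be sketched as a random-effects pooling of per-cohort coefficients for a single feature. The abstract does not state which estimator was used; the sketch below assumes the DerSimonian-Laird estimator of between-cohort variance and a normal approximation for the P value. The function name `random_effects_pool` and its inputs (per-cohort β-coefficients and standard errors) are hypothetical.

```python
import math


def random_effects_pool(betas, ses):
    """Pool per-cohort coefficients with a DerSimonian-Laird random-effects model.

    Assumption: this estimator is illustrative and is not confirmed as the
    one used in the analysis described above.
    betas: per-cohort effect estimates for one feature (e.g., one species).
    ses:   matching per-cohort standard errors (all > 0).
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two cohorts with matching inputs")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fixed = sum(wi * bi for wi, bi in zip(w, betas)) / sum_w

    # Cochran's Q measures heterogeneity across cohorts.
    q = sum(wi * (bi - beta_fixed) ** 2 for wi, bi in zip(w, betas))
    df = len(betas) - 1

    # DerSimonian-Laird between-cohort variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each cohort's sampling variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_re = sum(w_re)
    beta_re = sum(wi * bi for wi, bi in zip(w_re, betas)) / sum_w_re
    se_re = math.sqrt(1.0 / sum_w_re)

    # Two-sided P value under a normal approximation.
    z = beta_re / se_re
    p = math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "beta": beta_re,
        "se": se_re,
        "tau2": tau2,
        "p": p,
        "weights": [wi / sum_w_re for wi in w_re],  # relative study weights
    }
```

In such a scheme, the pooled P values would be computed once per feature and then corrected for multiple testing across features, and the relative weights would correspond to per-study weights of the kind shown as circles in Fig. C.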
Conclusion: Cross-cohort heterogeneity can be overcome using meta-analytic approaches to yield consistent and confident findings on the complex interplay between highly personalized gut microbial communities and IBD.

Large-scale international effort to uncover a more canonical IBD microbiome. (A) Baseline characteristics. (B) Species abundance by cohort found universally (i.e., in all seven), overlapping (in more than 1 cohort), or solely in one. (C) Summary statistics colored by PFDR for IBD vs. non-IBD & CD vs. UC. Cohort-level heatmap (right) shows broad agreement in effect & direction, though not always statistically significant in any given study. Circles represent study weight, and asterisks represent intracohort significance (*=PFDR<0.1, **=PFDR<0.05). Right heatmap displays results for confounders. (D) Enrichment of oral microbes, including aerotolerant Veillonella and Streptococci, plotted along the x-axis (differences in relative abundance between oral and gut sites in HMP1-II) vs. the y-axis (IBD vs. non-IBD β-coefficients). Points colored by PFDR. (E) IBD strains can be distinguished from those in non-IBD.