Background Emerging data support the importance of gut microbiome alterations in the pathogenesis of inflammatory bowel disease (IBD), but their role as non-invasive biomarkers for IBD has not been explored.
Aim We aim to identify fecal bacterial signatures in ulcerative colitis (UC) and Crohn's disease (CD) and evaluate their diagnostic accuracy in IBD.
Methods A total of 1,456 subjects (445 UC, 500 CD, 511 controls) from three in-house cohorts and three public datasets, covering America, Europe, Asia, and Australia, were included. Through metagenomic analysis, UC- and CD-associated bacterial markers were identified and validated across populations. A multiplex droplet digital PCR (m-ddPCR) method was developed for detection of selected bacterial markers. A random forest algorithm was used to construct a diagnostic model. Microbiome functional pathways were profiled using HUMAnN3. The functional dysbiosis score of a sample was defined as its median Bray-Curtis dissimilarity to the control group.
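As an illustration of the functional dysbiosis score defined in the Methods, the following is a minimal sketch, not the authors' pipeline. It assumes each sample is a vector of non-negative pathway abundances over the same ordered set of pathways; the function names `bray_curtis` and `functional_dysbiosis_score` are hypothetical, and whether a control sample is compared against itself is left to the caller.

```python
from statistics import median


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must cover the same pathways")
    total = sum(u) + sum(v)
    if total == 0:
        # Assumption: two all-zero profiles are treated as identical.
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def functional_dysbiosis_score(sample, controls):
    """Median Bray-Curtis dissimilarity from one sample to all control samples."""
    if not controls:
        raise ValueError("at least one control profile is required")
    return median(bray_curtis(sample, c) for c in controls)


# Toy pathway-abundance profiles (made-up numbers, for illustration only).
controls = [[0.5, 0.5, 0.0], [0.4, 0.6, 0.0], [0.6, 0.4, 0.0]]
shifted = functional_dysbiosis_score([0.0, 0.0, 1.0], controls)
similar = functional_dysbiosis_score([0.5, 0.5, 0.0], controls)
```

A profile that shares no pathways with any control scores 1, the maximum, while a profile close to the controls scores near 0.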
Results Gut bacterial composition differed significantly among patients with UC, patients with CD, and healthy controls (p=0.001). Metagenomic analysis identified ten and nine bacterial species markers that differentiated UC and CD, respectively, from healthy controls. In the discovery cohort, models constructed using these bacterial signatures achieved areas under the curve (AUCs) of 0.90 for UC and 0.94 for CD. Validation of the diagnostic models in independent cohorts from Hong Kong and Australia and in three public datasets from the United States, Europe, and mainland China showed an average AUC of 0.84 for UC and 0.85 for CD. Additionally, the diagnostic models distinguished UC from 730 non-IBD patients with or without gastrointestinal disease with an AUC of 0.85, and distinguished CD from these non-IBD patients with an AUC of 0.86 (Figure 1A). Selected bacterial markers quantified by m-ddPCR discriminated UC and CD from controls with AUCs of 0.88 and 0.87, respectively (Figure 1B). Functional analysis showed microbial metabolic perturbations in UC and CD (Figure 1C). Functional dysbiosis scores, used to evaluate the degree of metabolic dysregulation, were positively correlated with disease risk scores determined by our diagnostic models, implying that these biomarkers can reflect the metabolic dysfunction in patients with IBD (Figure 1D).
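For readers less familiar with the AUC values reported above, the sketch below shows one standard way to compute an AUC from model risk scores: the fraction of (patient, control) pairs in which the patient receives the higher score, with ties counted as half. This is a generic illustration, not the authors' evaluation code; the function name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score) + 0.5 * P(tie)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up risk scores for three patients and three controls.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.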
Conclusions We present the first cross-population metagenomic profiling study of IBD fecal microbiomes to discover and validate microbial biomarkers in ethnically different cohorts, and to independently validate selected biomarkers using an affordable, clinically relevant technology (ddPCR). This study brings us a step closer to affordable, non-invasive early diagnostic biomarkers for IBD from fecal samples.
This work is supported by InnoHK, The Government of Hong Kong, SAR, China and The Leona M. and Harry B. Helmsley Charitable Trust.

Figure 1 (A) Performance of the models with ten UC or nine CD selected bacterial species biomarkers for distinguishing patients with UC or CD from healthy controls in the discovery cohort, Hong Kong validation cohort, Australian validation cohort, and three public datasets, and from non-IBD patients with or without gastrointestinal disease. (B) Diagnostic performance of the ten UC or nine CD bacterial species biomarkers, as determined by m-ddPCR, in discriminating patients with UC or CD from healthy controls. (C) Comparison of functional dysbiosis scores, determined as the median Bray-Curtis dissimilarity between a sample and healthy controls. P values were calculated using the Wilcoxon rank-sum test. (D) Correlation between functional dysbiosis scores and the probability of disease generated by the models based on ten UC or nine CD bacterial species biomarkers. The correlation coefficient R and p value were obtained by Spearman correlation. *p<0.05.