Multiple-Disease Detection and Classification across Cohorts via Microbiome Search

Bibliographic Details
Main authors: Su, Xiaoquan; Jing, Gongchao; Sun, Zheng; Liu, Lu; Xu, Zhenjiang; McDonald, Daniel; Wang, Zengbin; Wang, Honglei; Gonzalez, Antonio; Zhang, Yufeng; Huang, Shi; Huttley, Gavin; Knight, Rob; Xu, Jian
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380586/
https://www.ncbi.nlm.nih.gov/pubmed/32184368
http://dx.doi.org/10.1128/mSystems.00150-20
Description
Summary: Microbiome-based disease classification depends on well-validated disease-specific models or a priori organismal markers. These are lacking for many diseases. Here, we present an alternative, search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares these to databases of samples from patients. Our strategy’s precision, sensitivity, and speed outperform model-based approaches. In addition, it is more robust to platform heterogeneity and to contamination in 16S rRNA gene amplicon data sets. This search-based strategy shows promise as an important first step in microbiome big-data-based diagnosis.

IMPORTANCE: Here, we present a search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares them to databases of samples from patients. This approach enables the identification of microbiome states associated with disease even in the presence of different cohorts, multiple sequencing platforms, or significant contamination.
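
The two-step search described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not state which similarity measure, novelty definition, or cutoff the study uses, so Bray-Curtis similarity, the "one minus best match" novelty score, the 0.5 threshold, and every function and variable name below are assumptions made only for illustration.

```python
from typing import Dict, List, Optional, Sequence

# A profile is a vector of relative abundances over a shared, fixed taxon order.
Profile = Sequence[float]


def bray_curtis_similarity(a: Profile, b: Profile) -> float:
    """Similarity in [0, 1]; a stand-in for whatever measure the study uses."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total > 0 else 0.0


def novelty(sample: Profile, database: List[Profile]) -> float:
    """Assumed novelty score: 1 minus the best match found in the database."""
    return 1.0 - max(bray_curtis_similarity(sample, ref) for ref in database)


def search_classify(
    sample: Profile,
    healthy_db: List[Profile],
    disease_dbs: Dict[str, List[Profile]],
    novelty_threshold: float = 0.5,  # hypothetical cutoff, not from the source
) -> Optional[str]:
    """Step 1: flag the sample if it is an outlier versus healthy samples.
    Step 2: assign a flagged sample to the best-matching patient database."""
    if novelty(sample, healthy_db) <= novelty_threshold:
        return None  # resembles the healthy database; not flagged
    return min(disease_dbs, key=lambda name: novelty(sample, disease_dbs[name]))


# Toy data, invented for this example only.
healthy = [[0.5, 0.3, 0.2, 0.0], [0.6, 0.2, 0.2, 0.0]]
diseases = {
    "disease_A": [[0.1, 0.1, 0.0, 0.8]],
    "disease_B": [[0.0, 0.9, 0.1, 0.0]],
}

flagged = search_classify([0.1, 0.0, 0.1, 0.8], healthy, diseases)
unflagged = search_classify([0.5, 0.3, 0.2, 0.0], healthy, diseases)
print(flagged, unflagged)  # disease_A None
```

Because no disease-specific model is trained in this sketch, adding a new disease only means adding another reference database to `disease_dbs`, which is the practical appeal of a search-based design.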