
Interpretable and accurate prediction models for metagenomics data

BACKGROUND: Microbiome biomarker discovery for patient diagnosis, prognosis, and risk evaluation is attracting broad interest. Selected groups of microbial features provide signatures that characterize host disease states such as cancer or cardio-metabolic diseases. Yet, the current predictive model...


Bibliographic Details
Main authors: Prifti, Edi; Chevaleyre, Yann; Hanczar, Blaise; Belda, Eugeni; Danchin, Antoine; Clément, Karine; Zucker, Jean-Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7062144/
https://www.ncbi.nlm.nih.gov/pubmed/32150601
http://dx.doi.org/10.1093/gigascience/giaa010
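
The abstract of this record describes the model's decision as a score computed by adding, subtracting, or dividing cumulative abundances of microbiome measurements. The following is a minimal sketch of that idea only, not the predomics implementation or its API: the function names, the feature names, the threshold, and the small `epsilon` guard against division by zero are all assumptions made for illustration.

```python
from typing import Mapping, Sequence


def cumulative_abundance(sample: Mapping[str, float], features: Sequence[str]) -> float:
    """Sum the abundances of the named features; absent features count as 0."""
    return sum(sample.get(name, 0.0) for name in features)


def difference_score(sample: Mapping[str, float],
                     added: Sequence[str],
                     subtracted: Sequence[str]) -> float:
    """Score = cumulative abundance of one group minus that of another group."""
    return cumulative_abundance(sample, added) - cumulative_abundance(sample, subtracted)


def ratio_score(sample: Mapping[str, float],
                numerator: Sequence[str],
                denominator: Sequence[str],
                epsilon: float = 1e-9) -> float:
    """Score = cumulative abundance of one group divided by that of another.

    epsilon is an illustrative guard against an all-zero denominator group.
    """
    return cumulative_abundance(sample, numerator) / (
        cumulative_abundance(sample, denominator) + epsilon
    )


def predict(score: float, threshold: float) -> bool:
    """Assign the positive class when the score exceeds a chosen threshold."""
    return score > threshold


# Hypothetical relative abundances for one sample (feature names are made up).
sample = {"species_a": 0.5, "species_b": 0.25, "species_c": 0.125}

score = difference_score(sample, ["species_a", "species_b"], ["species_c"])
print(score, predict(score, threshold=0.5))
```

Because such a score is a plain sum, difference, or ratio over a short list of named features, a reader can check by hand which measurements pushed a sample over the threshold, which is the kind of interpretability the abstract emphasizes.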
author Prifti, Edi
Chevaleyre, Yann
Hanczar, Blaise
Belda, Eugeni
Danchin, Antoine
Clément, Karine
Zucker, Jean-Daniel
collection PubMed
description BACKGROUND: Microbiome biomarker discovery for patient diagnosis, prognosis, and risk evaluation is attracting broad interest. Selected groups of microbial features provide signatures that characterize host disease states such as cancer or cardio-metabolic diseases. Yet, the current predictive models stemming from machine learning still behave as black boxes and seldom generalize well. Their interpretation is challenging for physicians and biologists, which makes them difficult to trust and use routinely in the physician–patient decision-making process. Novel methods that provide interpretability and biological insight are needed. Here, we introduce “predomics”, an original machine learning approach inspired by microbial ecosystem interactions that is tailored for metagenomics data. It discovers accurate predictive signatures and provides unprecedented interpretability. The decision provided by the predictive model is based on a simple, yet powerful score computed by adding, subtracting, or dividing cumulative abundance of microbiome measurements. RESULTS: Tested on >100 datasets, we demonstrate that predomics models are simple and highly interpretable. Even with such simplicity, they are at least as accurate as state-of-the-art methods. The family of best models, discovered during the learning process, offers the ability to distil biological information and to decipher the predictability signatures of the studied condition. In a proof-of-concept experiment, we successfully predicted body corpulence and metabolic improvement after bariatric surgery using pre-surgery microbiome data. CONCLUSIONS: Predomics is a new algorithm that helps in providing reliable and trustworthy diagnostic decisions in the microbiome field. Predomics is in accord with societal and legal requirements that plead for an explainable artificial intelligence approach in the medical field.
format Online
Article
Text
id pubmed-7062144
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7062144 2020-03-13 Interpretable and accurate prediction models for metagenomics data Prifti, Edi Chevaleyre, Yann Hanczar, Blaise Belda, Eugeni Danchin, Antoine Clément, Karine Zucker, Jean-Daniel Gigascience Research Oxford University Press 2020-03-09 /pmc/articles/PMC7062144/ /pubmed/32150601 http://dx.doi.org/10.1093/gigascience/giaa010 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Interpretable and accurate prediction models for metagenomics data
topic Research