Evaluating supervised and unsupervised background noise correction in human gut microbiome data
The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as se...
Main Authors: | Briscoe, Leah; Balliu, Brunilda; Sankararaman, Sriram; Halperin, Eran; Garud, Nandita R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8853548/ https://www.ncbi.nlm.nih.gov/pubmed/35130266 http://dx.doi.org/10.1371/journal.pcbi.1009838 |
_version_ | 1784653254422953984 |
---|---|
author | Briscoe, Leah Balliu, Brunilda Sankararaman, Sriram Halperin, Eran Garud, Nandita R. |
author_facet | Briscoe, Leah Balliu, Brunilda Sankararaman, Sriram Halperin, Eran Garud, Nandita R. |
author_sort | Briscoe, Leah |
collection | PubMed |
description | The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as sequencing protocol, which can make it difficult to predict phenotype and find biomarkers of disease. Supervised methods to correct for background noise, originally designed for gene expression and RNA-seq data, are commonly applied to microbiome data but may be limited because they cannot account for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We perform a comparative analysis of the ability of different denoising transformations in combination with supervised correction methods as well as an unsupervised principal component correction approach that is presently used in other domains but has not been applied to microbiome data to date. We find that the unsupervised principal component correction approach is comparable to the supervised approaches in reducing false discovery of biomarkers, with the added benefit of not needing to know the sources of variation a priori. However, in prediction tasks, it appears to improve prediction only when technical variables account for the majority of the variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses. |
format | Online Article Text |
id | pubmed-8853548 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8853548 2022-02-18 Evaluating supervised and unsupervised background noise correction in human gut microbiome data Briscoe, Leah Balliu, Brunilda Sankararaman, Sriram Halperin, Eran Garud, Nandita R. PLoS Comput Biol Research Article The ability to predict human phenotypes and identify biomarkers of disease from metagenomic data is crucial for the development of therapeutics for microbiome-associated diseases. However, metagenomic data is commonly affected by technical variables unrelated to the phenotype of interest, such as sequencing protocol, which can make it difficult to predict phenotype and find biomarkers of disease. Supervised methods to correct for background noise, originally designed for gene expression and RNA-seq data, are commonly applied to microbiome data but may be limited because they cannot account for unmeasured sources of variation. Unsupervised approaches address this issue, but current methods are limited because they are ill-equipped to deal with the unique aspects of microbiome data, which is compositional, highly skewed, and sparse. We perform a comparative analysis of the ability of different denoising transformations in combination with supervised correction methods as well as an unsupervised principal component correction approach that is presently used in other domains but has not been applied to microbiome data to date. We find that the unsupervised principal component correction approach is comparable to the supervised approaches in reducing false discovery of biomarkers, with the added benefit of not needing to know the sources of variation a priori. However, in prediction tasks, it appears to improve prediction only when technical variables account for the majority of the variance in the data. As new and larger metagenomic datasets become increasingly available, background noise correction will become essential for generating reproducible microbiome analyses. Public Library of Science 2022-02-07 /pmc/articles/PMC8853548/ /pubmed/35130266 http://dx.doi.org/10.1371/journal.pcbi.1009838 Text en © 2022 Briscoe et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Briscoe, Leah Balliu, Brunilda Sankararaman, Sriram Halperin, Eran Garud, Nandita R. Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title | Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title_full | Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title_fullStr | Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title_full_unstemmed | Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title_short | Evaluating supervised and unsupervised background noise correction in human gut microbiome data |
title_sort | evaluating supervised and unsupervised background noise correction in human gut microbiome data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8853548/ https://www.ncbi.nlm.nih.gov/pubmed/35130266 http://dx.doi.org/10.1371/journal.pcbi.1009838 |
work_keys_str_mv | AT briscoeleah evaluatingsupervisedandunsupervisedbackgroundnoisecorrectioninhumangutmicrobiomedata AT balliubrunilda evaluatingsupervisedandunsupervisedbackgroundnoisecorrectioninhumangutmicrobiomedata AT sankararamansriram evaluatingsupervisedandunsupervisedbackgroundnoisecorrectioninhumangutmicrobiomedata AT halperineran evaluatingsupervisedandunsupervisedbackgroundnoisecorrectioninhumangutmicrobiomedata AT garudnanditar evaluatingsupervisedandunsupervisedbackgroundnoisecorrectioninhumangutmicrobiomedata |
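
The description above mentions an unsupervised principal component correction applied to compositional, sparse microbiome data. The following is a minimal sketch of what such a correction could look like, not necessarily the authors' exact procedure. The function names (`clr_transform`, `remove_top_pcs`), the pseudocount of 1, the choice of a centered log-ratio transform, the number of components removed, and the simulated Poisson counts are all assumptions made for illustration.

```python
import numpy as np


def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount (an assumed value) avoids log(0) on sparse data.
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    # Subtract each sample's mean log abundance so every row sums to zero.
    return logged - logged.mean(axis=1, keepdims=True)


def remove_top_pcs(data, n_components):
    """Project the top principal components out of a samples x features matrix."""
    centered = data - data.mean(axis=0, keepdims=True)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]
    # Subtract the part of the data that lies along the top axes.
    return centered - centered @ top.T @ top


# Simulated counts (hypothetical data): 20 samples, 8 taxa.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(20, 8))

clr = clr_transform(counts)
corrected = remove_top_pcs(clr, n_components=2)
```

The sketch removes the leading axes of variation without knowing which technical variable produced them. That matches the benefit the description attributes to the unsupervised approach, and it also illustrates the caveat: if the leading components carry phenotype signal rather than technical noise, that signal is removed as well.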