
Multitable Methods for Microbiome Data Integration

The simultaneous study of multiple measurement types is a frequently encountered problem in practical data analysis. It is especially common in microbiome research, where several sources of data (for example, 16S rRNA, metagenomic, metabolomic, or transcriptomic data) can be collected on the same phys...


Bibliographic Details
Main authors: Sankaran, Kris; Holmes, Susan P.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6724662/
https://www.ncbi.nlm.nih.gov/pubmed/31555316
http://dx.doi.org/10.3389/fgene.2019.00627
collection PubMed
description The simultaneous study of multiple measurement types is a frequently encountered problem in practical data analysis. It is especially common in microbiome research, where several sources of data (for example, 16S rRNA, metagenomic, metabolomic, or transcriptomic data) can be collected on the same physical samples. There has been a proliferation of proposals for analyzing such multitable microbiome data, as is often the case when new data sources become more readily available, facilitating inquiry into new types of scientific questions. However, stepping back from the rush for new methods for multitable analysis in the microbiome literature, it is worthwhile to recognize the broader landscape of multitable methods, as they have been relevant in problem domains ranging across economics, robotics, genomics, chemometrics, and neuroscience. In different contexts, these techniques are called data integration, multi-omic, or multitask methods, for example. Of course, there is no unique optimal algorithm to use across domains: different instances of the multitable problem possess specific structure or variation that is worth incorporating into the methodology. Our purpose here is not to develop new algorithms, but rather to 1) distill relevant themes across different analysis approaches and 2) provide concrete workflows for approaching analysis, as a function of ultimate analysis goals and data characteristics (heterogeneity, dimensionality, sparsity). Towards the second goal, we have made code for all analyses and figures available online at https://github.com/krisrs1128/multitable_review.
id pubmed-6724662
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
journal Front Genet
publication date 2019-08-28
license Copyright © 2019 Sankaran and Holmes. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Multitable Methods for Microbiome Data Integration
topic Genetics