Multivariate multi-way analysis of multi-source data

Motivation: Analysis of variance (ANOVA)-type methods are the default tool for the analysis of data with multiple covariates. These tools have been generalized to the multivariate analysis of high-throughput biological datasets, where the main challenge is the problem of small sample size and high d...

Bibliographic Details
Main authors: Huopaniemi, Ilkka; Suvitaival, Tommi; Nikkilä, Janne; Orešič, Matej; Kaski, Samuel
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881359/
https://www.ncbi.nlm.nih.gov/pubmed/20529933
http://dx.doi.org/10.1093/bioinformatics/btq174
author Huopaniemi, Ilkka
Suvitaival, Tommi
Nikkilä, Janne
Orešič, Matej
Kaski, Samuel
collection PubMed
description Motivation: Analysis of variance (ANOVA)-type methods are the default tool for the analysis of data with multiple covariates. These tools have been generalized to the multivariate analysis of high-throughput biological datasets, where the main challenge is the problem of small sample size and high dimensionality. However, the existing multi-way analysis methods are not designed for the currently increasingly important experiments where data is obtained from multiple sources. Common examples of such settings include integrated analysis of metabolic and gene expression profiles, or metabolic profiles from several tissues in our case, in a controlled multi-way experimental setup where disease status, medical treatment, gender and time-series are usual covariates.
Results: We extend the applicability area of multivariate, multi-way ANOVA-type methods to multi-source cases by introducing a novel Bayesian model. The method is capable of finding covariate-related dependencies between the sources. It assumes the measurements consist of groups of similarly behaving variables, and estimates the multivariate covariate effects and their interaction effects for the discovered groups of variables. In particular, the method partitions the effects to those shared between the sources and to source-specific ones. The method is specifically designed for datasets with small sample sizes and high dimensionality. We apply the method to a lipidomics dataset from a lung cancer study with two-way experimental setup, where measurements from several tissues with mostly distinct lipids have been taken. The method is also directly applicable to gene expression and proteomics.
Availability: An R-implementation is available at http://www.cis.hut.fi/projects/mi/software/multiWayCCA/
Contact: ilkka.huopaniemi@tkk.fi; samuel.kaski@tkk.fi
format Text
id pubmed-2881359
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2881359 2010-06-08 Bioinformatics ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA Oxford University Press 2010-06-15 2010-06-01 /pmc/articles/PMC2881359/ /pubmed/20529933 http://dx.doi.org/10.1093/bioinformatics/btq174 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multivariate multi-way analysis of multi-source data
topic ISMB 2010 Conference Proceedings July 11 to July 13, 2010, Boston, MA, USA
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2881359/
https://www.ncbi.nlm.nih.gov/pubmed/20529933
http://dx.doi.org/10.1093/bioinformatics/btq174