
A multivariate approach to the integration of multi-omics datasets

BACKGROUND: To leverage the potential of multi-omics studies, exploratory data analysis methods that provide systematic integration and comparison of multiple layers of omics information are required. We describe multiple co-inertia analysis (MCIA), an exploratory data analysis method that identifie...


Bibliographic Details
Main Authors:	Meng, Chen; Kuster, Bernhard; Culhane, Aedín C; Gholami, Amin Moghaddas
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053266/ https://www.ncbi.nlm.nih.gov/pubmed/24884486 http://dx.doi.org/10.1186/1471-2105-15-162

_version_	1782320344872255488
author	Meng, Chen Kuster, Bernhard Culhane, Aedín C Gholami, Amin Moghaddas
author_facet	Meng, Chen Kuster, Bernhard Culhane, Aedín C Gholami, Amin Moghaddas
author_sort	Meng, Chen
collection	PubMed
description	BACKGROUND: To leverage the potential of multi-omics studies, exploratory data analysis methods that provide systematic integration and comparison of multiple layers of omics information are required. We describe multiple co-inertia analysis (MCIA), an exploratory data analysis method that identifies co-relationships between multiple high-dimensional datasets. Based on a covariance optimization criterion, MCIA simultaneously projects several datasets into the same dimensional space, transforming diverse sets of features onto the same scale, to extract the most variant features from each dataset and facilitate biological interpretation and pathway analysis. RESULTS: We demonstrate integration of multiple layers of information using MCIA, applied to two typical “omics” research scenarios. The integration of transcriptome and proteome profiles of cells in the NCI-60 cancer cell line panel revealed distinct, complementary features, which together increased the coverage and power of pathway analysis. Our analysis highlighted the importance of the leukemia extravasation signaling pathway in leukemia, which was not highly ranked in the analysis of any individual dataset. Secondly, we compared transcriptome profiles of high-grade serous ovarian tumors that were obtained on two different microarray platforms and by next-generation RNA-sequencing, to identify the most informative platform and extract robust biomarkers of molecular subtypes. We discovered that RNA-sequencing data processed using RPKM had greater variance than data processed with MapSplice and RSEM. We provided novel markers highly associated with tumor molecular subtype, combined from four data platforms. MCIA is implemented and available in the R/Bioconductor “omicade4” package. CONCLUSION: We believe MCIA is an attractive method for data integration and visualization of several datasets of multi-omics features observed on the same set of individuals. The method is not dependent on feature annotation, and thus it can extract important features even when they are not present across all datasets. MCIA provides simple graphical representations for the identification of relationships between large datasets.
format	Online Article Text
id	pubmed-4053266
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4053266 2014-06-20 A multivariate approach to the integration of multi-omics datasets Meng, Chen Kuster, Bernhard Culhane, Aedín C Gholami, Amin Moghaddas BMC Bioinformatics Methodology Article BioMed Central 2014-05-29 /pmc/articles/PMC4053266/ /pubmed/24884486 http://dx.doi.org/10.1186/1471-2105-15-162 Text en Copyright © 2014 Meng et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Article Meng, Chen Kuster, Bernhard Culhane, Aedín C Gholami, Amin Moghaddas A multivariate approach to the integration of multi-omics datasets
title	A multivariate approach to the integration of multi-omics datasets
title_full	A multivariate approach to the integration of multi-omics datasets
title_fullStr	A multivariate approach to the integration of multi-omics datasets
title_full_unstemmed	A multivariate approach to the integration of multi-omics datasets
title_short	A multivariate approach to the integration of multi-omics datasets
title_sort	multivariate approach to the integration of multi-omics datasets
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4053266/ https://www.ncbi.nlm.nih.gov/pubmed/24884486 http://dx.doi.org/10.1186/1471-2105-15-162
work_keys_str_mv	AT mengchen amultivariateapproachtotheintegrationofmultiomicsdatasets AT kusterbernhard amultivariateapproachtotheintegrationofmultiomicsdatasets AT culhaneaedinc amultivariateapproachtotheintegrationofmultiomicsdatasets AT gholamiaminmoghaddas amultivariateapproachtotheintegrationofmultiomicsdatasets AT mengchen multivariateapproachtotheintegrationofmultiomicsdatasets AT kusterbernhard multivariateapproachtotheintegrationofmultiomicsdatasets AT culhaneaedinc multivariateapproachtotheintegrationofmultiomicsdatasets AT gholamiaminmoghaddas multivariateapproachtotheintegrationofmultiomicsdatasets


Similar Items