Assessing reproducibility of matrix factorization methods in independent transcriptomes

MOTIVATION: Matrix factorization (MF) methods are widely used in order to reduce dimensionality of transcriptomic datasets to the action of few hidden factors (metagenes). MF algorithms have never been compared based on the between-datasets reproducibility of their outputs in similar independent dat...

Bibliographic Details
Main Authors:	Cantini, Laura; Kairov, Ulykbek; de Reyniès, Aurélien; Barillot, Emmanuel; Radvanyi, François; Zinovyev, Andrei
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6821374/ https://www.ncbi.nlm.nih.gov/pubmed/30938767 http://dx.doi.org/10.1093/bioinformatics/btz225

_version_	1783464130862120960
author	Cantini, Laura Kairov, Ulykbek de Reyniès, Aurélien Barillot, Emmanuel Radvanyi, François Zinovyev, Andrei
author_facet	Cantini, Laura Kairov, Ulykbek de Reyniès, Aurélien Barillot, Emmanuel Radvanyi, François Zinovyev, Andrei
author_sort	Cantini, Laura
collection	PubMed
description	MOTIVATION: Matrix factorization (MF) methods are widely used in order to reduce dimensionality of transcriptomic datasets to the action of few hidden factors (metagenes). MF algorithms have never been compared based on the between-datasets reproducibility of their outputs in similar independent datasets. Lack of this knowledge might have a crucial impact when generalizing the predictions made in a study to others. RESULTS: We systematically test widely used MF methods on several transcriptomic datasets collected from the same cancer type (14 colorectal, 8 breast and 4 ovarian cancer transcriptomic datasets). Inspired by concepts of evolutionary bioinformatics, we design a novel framework based on Reciprocally Best Hit (RBH) graphs in order to benchmark the MF methods for their ability to produce generalizable components. We show that a particular protocol of application of independent component analysis (ICA), accompanied by a stabilization procedure, leads to a significant increase in the between-datasets reproducibility. Moreover, we show that the signals detected through this method are systematically more interpretable than those of other standard methods. We developed a user-friendly tool for performing the Stabilized ICA-based RBH meta-analysis. We apply this methodology to the study of colorectal cancer (CRC) for which 14 independent transcriptomic datasets can be collected. The resulting RBH graph maps the landscape of interconnected factors associated to biological processes or to technological artifacts. These factors can be used as clinical biomarkers or robust and tumor-type specific transcriptomic signatures of tumoral cells or tumoral microenvironment. Their intensities in different samples shed light on the mechanistic basis of CRC molecular subtyping. AVAILABILITY AND IMPLEMENTATION: The RBH construction tool is available from http://goo.gl/DzpwYp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-6821374
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6821374 2019-11-04 Assessing reproducibility of matrix factorization methods in independent transcriptomes Cantini, Laura Kairov, Ulykbek de Reyniès, Aurélien Barillot, Emmanuel Radvanyi, François Zinovyev, Andrei Bioinformatics Original Papers Oxford University Press 2019-11-01 2019-04-02 /pmc/articles/PMC6821374/ /pubmed/30938767 http://dx.doi.org/10.1093/bioinformatics/btz225 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Original Papers Cantini, Laura Kairov, Ulykbek de Reyniès, Aurélien Barillot, Emmanuel Radvanyi, François Zinovyev, Andrei Assessing reproducibility of matrix factorization methods in independent transcriptomes
title	Assessing reproducibility of matrix factorization methods in independent transcriptomes
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6821374/ https://www.ncbi.nlm.nih.gov/pubmed/30938767 http://dx.doi.org/10.1093/bioinformatics/btz225
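The description above mentions a framework based on Reciprocally Best Hit (RBH) graphs for comparing components across datasets. As a rough illustration of the general RBH idea only, and not the authors' tool or exact protocol, the sketch below links a component of one dataset to a component of another when each is the other's best match by absolute Pearson correlation. The function name `reciprocal_best_hits`, the `min_score` threshold, the use of Pearson correlation, and the toy matrices are all assumptions made for this example.

```python
import numpy as np


def reciprocal_best_hits(A, B, min_score=0.0):
    """Return (i, j, score) for columns A[:, i] and B[:, j] that are mutual best matches.

    A and B are genes x components arrays whose rows refer to the same genes
    in the same order. min_score is a hypothetical cut-off, not from the paper.
    """
    k1 = A.shape[1]
    # Pearson correlation between every column of A and every column of B.
    corr = np.corrcoef(A.T, B.T)[:k1, k1:]
    # Use the absolute value because the sign of a component is arbitrary.
    score = np.abs(corr)
    best_for_a = score.argmax(axis=1)  # best column of B for each column of A
    best_for_b = score.argmax(axis=0)  # best column of A for each column of B
    return [
        (i, int(j), float(score[i, j]))
        for i, j in enumerate(best_for_a)
        if best_for_b[j] == i and score[i, j] >= min_score
    ]


# Toy data: mutually orthogonal, zero-mean patterns over 8 "genes".
h1 = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
h2 = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
h3 = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
h6 = h2 * h3

A = np.column_stack([h1, h2, h3])   # components from a first dataset
B = np.column_stack([-h3, h1, h6])  # components from a second dataset

hits = reciprocal_best_hits(A, B, min_score=0.5)
for i, j, s in hits:
    print(f"A[{i}] <-> B[{j}] (|r| = {s:.2f})")
```

In this toy case the first and third columns of A each have a counterpart in B (one of them with flipped sign), while the second column of A has none, so it is left unlinked. Collecting such links over every pair of datasets would give the edges of a graph in the spirit of the one the description refers to.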

Similar Items