
Rectified factor networks for biclustering of omics data

MOTIVATION: Biclustering has become a major tool for analyzing large datasets given as a matrix of samples times features and has been successfully applied in life sciences and e-commerce for drug design and recommender systems, respectively. Factor Analysis for Bicluster Acquisition (FABIA), one of t...


Bibliographic Details
Main Authors:	Clevert, Djork-Arné; Unterthiner, Thomas; Povysil, Gundula; Hochreiter, Sepp
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2017
Subjects:	ISMB/ECCB 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870657/ https://www.ncbi.nlm.nih.gov/pubmed/28881961 http://dx.doi.org/10.1093/bioinformatics/btx226

author	Clevert, Djork-Arné; Unterthiner, Thomas; Povysil, Gundula; Hochreiter, Sepp
collection	PubMed
description	MOTIVATION: Biclustering has become a major tool for analyzing large datasets given as a matrix of samples times features and has been successfully applied in life sciences and e-commerce for drug design and recommender systems, respectively. Factor Analysis for Bicluster Acquisition (FABIA), one of the most successful biclustering methods, is a generative model that represents each bicluster by two sparse membership vectors: one for the samples and one for the features. However, FABIA is restricted to about 20 code units because of the high computational complexity of computing the posterior. Furthermore, code units are sometimes insufficiently decorrelated and sample membership is difficult to determine. We propose to use the recently introduced unsupervised Deep Learning approach Rectified Factor Networks (RFNs) to overcome the drawbacks of existing biclustering methods. RFNs efficiently construct very sparse, non-linear, high-dimensional representations of the input via their posterior means. RFN learning is a generalized alternating minimization algorithm based on the posterior regularization method, which enforces non-negative and normalized posterior means. Each code unit represents a bicluster: samples for which the code unit is active belong to the bicluster, as do features that have activating weights to the code unit. RESULTS: On 400 benchmark datasets and on three gene expression datasets with known clusters, RFN outperformed 13 other biclustering methods including FABIA. On data of the 1000 Genomes Project, RFN could identify DNA segments indicating that interbreeding with other hominins had already begun before the ancestors of modern humans left Africa. AVAILABILITY AND IMPLEMENTATION: https://github.com/bioinf-jku/librfn
format	Online Article Text
id	pubmed-5870657
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-5870657 2018-04-05 Rectified factor networks for biclustering of omics data Clevert, Djork-Arné; Unterthiner, Thomas; Povysil, Gundula; Hochreiter, Sepp Bioinformatics Oxford University Press 2017-07-15 2017-07-12 /pmc/articles/PMC5870657/ /pubmed/28881961 http://dx.doi.org/10.1093/bioinformatics/btx226 Text en © The Author 2017. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Rectified factor networks for biclustering of omics data
topic	ISMB/ECCB 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870657/ https://www.ncbi.nlm.nih.gov/pubmed/28881961 http://dx.doi.org/10.1093/bioinformatics/btx226
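
The abstract's membership rule (a sample belongs to a bicluster when the code unit is active for it, and a feature belongs when it has an activating weight to that code unit) can be illustrated with a minimal sketch. This is not the librfn API: the function name `extract_biclusters`, the matrices `H` (posterior means, samples by code units) and `W` (weights, features by code units), and both thresholds are assumptions made for illustration, and "activating" is read here as "strictly above a threshold".

```python
import numpy as np


def extract_biclusters(H, W, act_threshold=0.0, weight_threshold=0.0):
    """Read biclusters off a code matrix H and a weight matrix W.

    H: (n_samples, n_code_units) non-negative posterior means (assumed layout).
    W: (n_features, n_code_units) weights from features to code units (assumed layout).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    H = np.asarray(H, dtype=float)
    W = np.asarray(W, dtype=float)
    if H.shape[1] != W.shape[1]:
        raise ValueError("H and W must have the same number of code units")

    biclusters = []
    for k in range(H.shape[1]):
        # Samples for which code unit k is active.
        samples = np.flatnonzero(H[:, k] > act_threshold)
        # Features whose weight to code unit k exceeds the threshold.
        features = np.flatnonzero(W[:, k] > weight_threshold)
        # Keep only code units that select at least one sample and one feature.
        if samples.size and features.size:
            biclusters.append({
                "code_unit": k,
                "samples": samples.tolist(),
                "features": features.tolist(),
            })
    return biclusters


# Toy data: 4 samples, 3 features, 2 code units (made up for illustration).
H = np.array([[0.9, 0.0],
              [0.7, 0.0],
              [0.0, 0.5],
              [0.0, 0.0]])
W = np.array([[1.2, 0.0],
              [0.8, 0.0],
              [0.0, 0.6]])
biclusters = extract_biclusters(H, W)
```

In the toy data, code unit 0 groups samples 0 and 1 with features 0 and 1, and code unit 1 groups sample 2 with feature 2; sample 3 is active for no code unit and so belongs to no bicluster.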
