
MCbiclust: a novel algorithm to discover large-scale functionally related gene sets from massive transcriptomics data collections

The potential to understand fundamental biological processes from gene expression data has grown in parallel with the recent explosion of the size of data collections. However, to exploit this potential, novel analytical methods are required, capable of discovering large co-regulated gene networks....

Bibliographic Details
Main Authors:	Bentham, Robert B., Bryson, Kevin, Szabadkai, Gyorgy
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2017
Subjects:	Computational Biology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5587796/ https://www.ncbi.nlm.nih.gov/pubmed/28911113 http://dx.doi.org/10.1093/nar/gkx590

author	Bentham, Robert B.; Bryson, Kevin; Szabadkai, Gyorgy
author_sort	Bentham, Robert B.
collection	PubMed
description	The potential to understand fundamental biological processes from gene expression data has grown in parallel with the recent explosion of the size of data collections. However, to exploit this potential, novel analytical methods are required, capable of discovering large co-regulated gene networks. We found current methods limited in the size of correlated gene sets they could discover within biologically heterogeneous data collections, hampering the identification of multi-gene controlled fundamental cellular processes such as energy metabolism, organelle biogenesis and stress responses. Here we describe a novel biclustering algorithm called Massively Correlated Biclustering (MCbiclust) that selects samples and genes from large datasets with maximal correlated gene expression, allowing regulation of complex networks to be examined. The method has been evaluated using synthetic data and applied to large bacterial and cancer cell datasets. We show that the large biclusters discovered, so far elusive to identification by existing techniques, are biologically relevant and thus MCbiclust has great potential in the analysis of transcriptomics data to identify large-scale unknown effects hidden within the data. The identified massive biclusters can be used to develop improved transcriptomics based diagnosis tools for diseases caused by altered gene expression, or used for further network analysis to understand genotype-phenotype correlations.
format	Online Article Text
id	pubmed-5587796
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-5587796 2017-09-11 Nucleic Acids Res Computational Biology Oxford University Press 2017-09-06 2017-07-14 /pmc/articles/PMC5587796/ /pubmed/28911113 http://dx.doi.org/10.1093/nar/gkx590 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	MCbiclust: a novel algorithm to discover large-scale functionally related gene sets from massive transcriptomics data collections
topic	Computational Biology