
Seed-Based Biclustering of Gene Expression Data

BACKGROUND: Accumulated biological research outcomes show that biological functions do not depend on individual genes, but on complex gene networks. Microarray data are widely used to cluster genes according to their expression levels across experimental conditions. However, functionally related gen...


Bibliographic Details
Main Authors:	An, Jiyuan, Liew, Alan Wee-Chung, Nelson, Colleen C.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2012
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3411756/ https://www.ncbi.nlm.nih.gov/pubmed/22879981 http://dx.doi.org/10.1371/journal.pone.0042431

_version_	1782239888927621120
author	An, Jiyuan Liew, Alan Wee-Chung Nelson, Colleen C.
author_facet	An, Jiyuan Liew, Alan Wee-Chung Nelson, Colleen C.
author_sort	An, Jiyuan
collection	PubMed
description	BACKGROUND: Accumulated biological research outcomes show that biological functions do not depend on individual genes, but on complex gene networks. Microarray data are widely used to cluster genes according to their expression levels across experimental conditions. However, functionally related genes generally do not show coherent expression across all conditions, since any given cellular process is active only under a subset of conditions. Biclustering finds gene clusters that have similar expression levels across a subset of conditions. This paper proposes a seed-based algorithm that identifies coherent genes in an exhaustive but efficient manner. METHODS: In order to find the biclusters in a gene expression dataset, we exhaustively select combinations of genes and conditions as seeds to create candidate bicluster tables. The tables have two columns: (a) a gene set, and (b) the conditions on which the gene set has dissimilar expression levels to the seed. First, the genes with fewer than the maximum number of dissimilar conditions are identified and a table of these genes is created. Second, the rows that have the same dissimilar conditions are grouped together. Third, the table is sorted in ascending order based on the number of dissimilar conditions. Finally, beginning with the first row of the table, a test is run repeatedly to determine whether the cardinality of the gene set in the row is greater than the minimum threshold number of genes in a bicluster. If so, a bicluster is output and the corresponding row is removed from the table. Repeating this process, all biclusters in the table are systematically identified until the table becomes empty. CONCLUSIONS: This paper presents a novel biclustering algorithm for the identification of additive biclusters. Since it involves exhaustively testing combinations of genes and conditions, the additive biclusters can be found more readily.
format	Online Article Text
id	pubmed-3411756
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3411756 2012-08-09 Seed-Based Biclustering of Gene Expression Data An, Jiyuan Liew, Alan Wee-Chung Nelson, Colleen C. PLoS One Research Article Public Library of Science 2012-08-03 /pmc/articles/PMC3411756/ /pubmed/22879981 http://dx.doi.org/10.1371/journal.pone.0042431 Text en © 2012 An et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle	Research Article An, Jiyuan Liew, Alan Wee-Chung Nelson, Colleen C. Seed-Based Biclustering of Gene Expression Data
title	Seed-Based Biclustering of Gene Expression Data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3411756/ https://www.ncbi.nlm.nih.gov/pubmed/22879981 http://dx.doi.org/10.1371/journal.pone.0042431
work_keys_str_mv	AT anjiyuan seedbasedbiclusteringofgeneexpressiondata AT liewalanweechung seedbasedbiclusteringofgeneexpressiondata AT nelsoncolleenc seedbasedbiclusteringofgeneexpressiondata
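
The four steps listed under METHODS in the abstract can be sketched for a single seed as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define "dissimilar", so the additive test used here (a gene matches the seed gene up to a constant offset fixed at the seed condition, within a tolerance `eps`) is an assumption, as are all function names, parameter names, and default values. Rows that fail the size test are simply discarded, which is a simplification, and the exhaustive loop over all seeds is left out.

```python
from collections import defaultdict


def candidate_table(data, seed_gene, seed_cond, eps, max_dissimilar):
    """Steps 1-3: build the sorted candidate table for one (gene, condition) seed."""
    seed_row = data[seed_gene]
    groups = defaultdict(set)  # dissimilar-condition set -> gene set
    for gene, row in enumerate(data):
        # Assumed additive model: the gene equals the seed gene plus a
        # constant offset, and the offset is read off at the seed condition.
        offset = row[seed_cond] - seed_row[seed_cond]
        dissimilar = frozenset(
            j for j in range(len(seed_row))
            if abs(row[j] - seed_row[j] - offset) > eps
        )
        # Step 1: keep genes with fewer than the maximum number of
        # dissimilar conditions.
        if len(dissimilar) < max_dissimilar:
            # Step 2: genes sharing the same dissimilar conditions share a row.
            groups[dissimilar].add(gene)
    # Step 3: sort rows in ascending order of the number of dissimilar conditions.
    return sorted(groups.items(), key=lambda kv: (len(kv[0]), sorted(kv[0])))


def biclusters_for_seed(data, seed_gene, seed_cond,
                        eps=0.1, max_dissimilar=2, min_genes=2):
    """Step 4: walk the table from the first row until it is empty."""
    table = candidate_table(data, seed_gene, seed_cond, eps, max_dissimilar)
    n_conds = len(data[seed_gene])
    found = []
    while table:
        dissimilar, genes = table.pop(0)  # the row is removed either way
        if len(genes) > min_genes:
            conds = [j for j in range(n_conds) if j not in dissimilar]
            found.append((sorted(genes), conds))
    return found


# Toy expression matrix (rows = genes, columns = conditions), invented for
# illustration only.
data = [
    [1.0, 2.0, 3.0, 4.0],  # gene 0, used as the seed gene
    [2.0, 3.0, 4.0, 5.0],  # gene 1: seed + 1 on every condition
    [0.0, 1.0, 2.0, 9.0],  # gene 2: seed - 1 except on condition 3
    [3.0, 4.0, 5.0, 6.0],  # gene 3: seed + 2 on every condition
    [1.0, 7.0, 0.0, 2.0],  # gene 4: unrelated to the seed
]
result = biclusters_for_seed(data, seed_gene=0, seed_cond=0)
print(result)
```

With the toy matrix, genes 0, 1 and 3 share an empty set of dissimilar conditions and form one additive bicluster over all four conditions; gene 2 sits alone in its row and falls below the assumed size threshold.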


Similar Items