
Extracting regulatory modules from gene expression data by sequential pattern mining

BACKGROUND: Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extractin...


Bibliographic Details
Main authors: Kim, Mingoo, Shin, Hyunjung, Su Chung, Tae, Joung, Je-Gun, Kim, Ju Han
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3333188/
https://www.ncbi.nlm.nih.gov/pubmed/22369275
http://dx.doi.org/10.1186/1471-2164-12-S3-S5
author Kim, Mingoo
Shin, Hyunjung
Su Chung, Tae
Joung, Je-Gun
Kim, Ju Han
collection PubMed
description BACKGROUND: Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extracting RMs. Among biclustering methods, order-preserving biclustering by a sequential pattern mining technique has an inherent advantage over conventional biclustering approaches, since it preserves the order of genes (or conditions) according to the magnitude of the expression value. However, previous sequential pattern mining-based biclustering methods have several weaknesses: they can easily become computationally intractable on real-size microarray data, and they are sensitive to inherent noise in the expression values. RESULTS: In this paper, we propose a novel sequential pattern mining algorithm that scales with the size of microarray data and is robust to noise. When applied to yeast microarray data, the proposed algorithm successfully found long order-preserving patterns, which are biologically significant and cannot be found in randomly shuffled data. The resulting patterns are well enriched for known annotations and are consistent with known biological knowledge. Furthermore, RMs as well as inter-module relations were inferred from the biologically significant patterns. CONCLUSIONS: Our approach for identifying RMs could be valuable for systematically revealing the mechanism of gene regulation at a genome-wide level.
format Online
Article
Text
id pubmed-3333188
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3333188 2012-04-24 Extracting regulatory modules from gene expression data by sequential pattern mining Kim, Mingoo Shin, Hyunjung Su Chung, Tae Joung, Je-Gun Kim, Ju Han BMC Genomics Proceedings BioMed Central 2011-11-30 /pmc/articles/PMC3333188/ /pubmed/22369275 http://dx.doi.org/10.1186/1471-2164-12-S3-S5 Text en Copyright ©2011 Kim et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Extracting regulatory modules from gene expression data by sequential pattern mining
topic Proceedings