Extracting regulatory modules from gene expression data by sequential pattern mining
BACKGROUND: Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extractin...
Main Authors: | Kim, Mingoo; Shin, Hyunjung; Su Chung, Tae; Joung, Je-Gun; Kim, Ju Han |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3333188/ https://www.ncbi.nlm.nih.gov/pubmed/22369275 http://dx.doi.org/10.1186/1471-2164-12-S3-S5 |
_version_ | 1782230393172262912 |
---|---|
author | Kim, Mingoo Shin, Hyunjung Su Chung, Tae Joung, Je-Gun Kim, Ju Han |
author_facet | Kim, Mingoo Shin, Hyunjung Su Chung, Tae Joung, Je-Gun Kim, Ju Han |
author_sort | Kim, Mingoo |
collection | PubMed |
description | BACKGROUND: Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extracting RMs. Among biclustering methods, order-preserving biclustering by a sequential pattern mining technique has native advantage over the conventional biclustering approaches since it preserves the order of genes (or conditions) according to the magnitude of the expression value. However, previous sequential pattern mining-based biclustering has several weak points in that they can easily be computationally intractable in the real-size of microarray data and sensitive to inherent noise in the expression value. RESULTS: In this paper, we propose a novel sequential pattern mining algorithm that is scalable in the size of microarray data and robust with respect to noise. When applied to the microarray data of yeast, the proposed algorithm successfully found long order-preserving patterns, which are biologically significant but cannot be found in randomly shuffled data. The resulting patterns are well enriched to known annotations and are consistent with known biological knowledge. Furthermore, RMs as well as inter-module relations were inferred from the biologically significant patterns. CONCLUSIONS: Our approach for identifying RMs could be valuable for systematically revealing the mechanism of gene regulation at a genome-wide level. |
format | Online Article Text |
id | pubmed-3333188 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-33331882012-04-24 Extracting regulatory modules from gene expression data by sequential pattern mining Kim, Mingoo Shin, Hyunjung Su Chung, Tae Joung, Je-Gun Kim, Ju Han BMC Genomics Proceedings BACKGROUND: Identifying a regulatory module (RM), a bi-set of co-regulated genes and co-regulating conditions (or samples), has been an important challenge in functional genomics and bioinformatics. Given a microarray gene-expression matrix, biclustering has been the most common method for extracting RMs. Among biclustering methods, order-preserving biclustering by a sequential pattern mining technique has native advantage over the conventional biclustering approaches since it preserves the order of genes (or conditions) according to the magnitude of the expression value. However, previous sequential pattern mining-based biclustering has several weak points in that they can easily be computationally intractable in the real-size of microarray data and sensitive to inherent noise in the expression value. RESULTS: In this paper, we propose a novel sequential pattern mining algorithm that is scalable in the size of microarray data and robust with respect to noise. When applied to the microarray data of yeast, the proposed algorithm successfully found long order-preserving patterns, which are biologically significant but cannot be found in randomly shuffled data. The resulting patterns are well enriched to known annotations and are consistent with known biological knowledge. Furthermore, RMs as well as inter-module relations were inferred from the biologically significant patterns. CONCLUSIONS: Our approach for identifying RMs could be valuable for systematically revealing the mechanism of gene regulation at a genome-wide level. BioMed Central 2011-11-30 /pmc/articles/PMC3333188/ /pubmed/22369275 http://dx.doi.org/10.1186/1471-2164-12-S3-S5 Text en Copyright ©2011 Kim et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Kim, Mingoo Shin, Hyunjung Su Chung, Tae Joung, Je-Gun Kim, Ju Han Extracting regulatory modules from gene expression data by sequential pattern mining |
title | Extracting regulatory modules from gene expression data by sequential pattern mining |
title_full | Extracting regulatory modules from gene expression data by sequential pattern mining |
title_fullStr | Extracting regulatory modules from gene expression data by sequential pattern mining |
title_full_unstemmed | Extracting regulatory modules from gene expression data by sequential pattern mining |
title_short | Extracting regulatory modules from gene expression data by sequential pattern mining |
title_sort | extracting regulatory modules from gene expression data by sequential pattern mining |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3333188/ https://www.ncbi.nlm.nih.gov/pubmed/22369275 http://dx.doi.org/10.1186/1471-2164-12-S3-S5 |
work_keys_str_mv | AT kimmingoo extractingregulatorymodulesfromgeneexpressiondatabysequentialpatternmining AT shinhyunjung extractingregulatorymodulesfromgeneexpressiondatabysequentialpatternmining AT suchungtae extractingregulatorymodulesfromgeneexpressiondatabysequentialpatternmining AT joungjegun extractingregulatorymodulesfromgeneexpressiondatabysequentialpatternmining AT kimjuhan extractingregulatorymodulesfromgeneexpressiondatabysequentialpatternmining |