An efficient method for mining cross-timepoint gene regulation sequential patterns from time course gene expression datasets

BACKGROUND: Observing gene expression changes that imply gene regulation through repeated time course experiments has become increasingly important. However, no effective method exists for handling this kind of data. For instance, in a clinical/biological progression like inflammatory...

Bibliographic Details
Main authors:	Cheng, Chun-Pei; Liu, Yu-Cheng; Tsai, Yi-Lin; Tseng, Vincent S
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848764/ https://www.ncbi.nlm.nih.gov/pubmed/24267918 http://dx.doi.org/10.1186/1471-2105-14-S12-S3

_version_	1782293816449957888
author	Cheng, Chun-Pei; Liu, Yu-Cheng; Tsai, Yi-Lin; Tseng, Vincent S
author_sort	Cheng, Chun-Pei
collection	PubMed
description	BACKGROUND: Observing gene expression changes that imply gene regulation through repeated time course experiments has become increasingly important. However, no effective method exists for handling this kind of data. For instance, in a clinical/biological progression such as inflammatory response or cancer formation, a large number of differentially expressed genes at different time points can be identified through a large-scale microarray approach. For each repeated experiment with different samples, converting the microarray datasets into transactional databases containing the significant singleton genes at each time point allows sequential patterns implying gene regulation to be identified. Although traditional sequential pattern mining methods have been widely and successfully used in other domains, such as mining customer purchasing sequences from a transactional database, to our knowledge they are not suitable for such biological datasets because every transaction in the converted database may contain too many items/genes. RESULTS: In this paper, we propose a new algorithm called CTGR-Span (Cross-Timepoint Gene Regulation Sequential pattern) to efficiently mine CTGR-SPs (Cross-Timepoint Gene Regulation Sequential Patterns), even on larger datasets where traditional algorithms are infeasible. CTGR-Span includes several biologically motivated parameters based on the characteristics of gene regulation. We perform a parameter tuning process using a GO enrichment analysis to yield more biologically meaningful CTGR-SPs. The proposed method was evaluated on two publicly available human time course microarray datasets and outperformed the traditional methods in execution efficiency. When checked against previous literature, the resulting patterns also correlated strongly with the experimental backgrounds of the datasets used in this study.
CONCLUSIONS: We propose an efficient algorithm, CTGR-Span, to mine several biologically meaningful CTGR-SPs. We postulate that biologists can benefit from our new algorithm, since patterns implying gene regulation could provide further insight into the mechanisms of novel gene regulation during a biological or clinical progression. The Java source code, program tutorial and other related materials used in this program are available at http://websystem.csie.ncku.edu.tw/CTGR-Span.rar.
format	Online Article Text
id	pubmed-3848764
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3848764 2013-12-09 BMC Bioinformatics Research BioMed Central 2013-09-24 /pmc/articles/PMC3848764/ /pubmed/24267918 http://dx.doi.org/10.1186/1471-2105-14-S12-S3 Text en Copyright © 2013 Cheng et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	An efficient method for mining cross-timepoint gene regulation sequential patterns from time course gene expression datasets
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848764/ https://www.ncbi.nlm.nih.gov/pubmed/24267918 http://dx.doi.org/10.1186/1471-2105-14-S12-S3
