Optimized mixed Markov models for motif identification

Bibliographic Details
Main authors:	Huang, Weichun; Umbach, David M; Ohler, Uwe; Li, Leping
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1534070/ https://www.ncbi.nlm.nih.gov/pubmed/16749929 http://dx.doi.org/10.1186/1471-2105-7-279

collection	PubMed
description	BACKGROUND: Identifying functional elements, such as transcription factor binding sites, is a fundamental step in reconstructing gene regulatory networks. It remains challenging, largely because training samples are in limited supply. RESULTS: We introduce a novel and flexible model, the Optimized Mixture Markov model (OMiMa), and related methods that allow model complexity to be adjusted for different motifs. In comparison with other leading methods, OMiMa can incorporate more than NNSplice's pairwise dependencies; OMiMa avoids model over-fitting better than the Permuted Variable Length Markov Model (PVLMM); and OMiMa requires smaller training samples than the Maximum Entropy Model (MEM). Testing on both simulated and actual data (regulatory cis-elements and splice sites), we found OMiMa's performance superior to the other leading methods in terms of prediction accuracy, required size of training data, or computational time. Our OMiMa system, to our knowledge, is the only motif-finding tool that incorporates automatic selection of the best model. OMiMa is freely available at [1]. CONCLUSION: Our optimized mixture of Markov models represents an alternative to the existing methods for modeling dependent structures within a biological motif. Our model is conceptually simple and effective, and can improve prediction accuracy and/or computational speed over other leading methods.
id	pubmed-1534070
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-1534070 2006-08-10 BMC Bioinformatics. Methodology Article. BioMed Central, 2006-06-02. Copyright © 2006 Huang et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
