
Pattern statistics on Markov chains and sensitivity to parameter estimation

BACKGROUND: In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameter must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and wh...


Bibliographic Details
Main author: Nuel, Grégory
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1647278/
https://www.ncbi.nlm.nih.gov/pubmed/17044916
http://dx.doi.org/10.1186/1748-7188-1-17
author Nuel, Grégory
collection PubMed
description BACKGROUND: In order to compute pattern statistics in computational biology, a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant words common to a set of sequences, ...). RESULTS: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations, and a simple pattern study is also considered. CONCLUSION: We establish that the use of high-order Markov models could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.
format Text
id pubmed-1647278
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1647278 2006-11-22 Pattern statistics on Markov chains and sensitivity to parameter estimation Nuel, Grégory Algorithms Mol Biol Research BACKGROUND: In order to compute pattern statistics in computational biology, a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant words common to a set of sequences, ...). RESULTS: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations, we use the delta method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations, and a simple pattern study is also considered. CONCLUSION: We establish that the use of high-order Markov models could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation. BioMed Central 2006-10-17 /pmc/articles/PMC1647278/ /pubmed/17044916 http://dx.doi.org/10.1186/1748-7188-1-17 Text en Copyright © 2006 Nuel; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Pattern statistics on Markov chains and sensitivity to parameter estimation
topic Research