Statistical tests to compare motif count exceptionalities

BACKGROUND: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing th...

Bibliographic Details
Main authors: Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1838430/
https://www.ncbi.nlm.nih.gov/pubmed/17346349
http://dx.doi.org/10.1186/1471-2105-8-84
_version_ 1782132829509910528
author Robin, Stéphane
Schbath, Sophie
Vandewalle, Vincent
collection PubMed
description BACKGROUND: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Simply comparing the motif count p-values in each sequence is not sufficient to decide whether this motif is significantly more exceptional in one sequence than in the other. A statistical test is required. RESULTS: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. CONCLUSION: The exact binomial test is particularly well suited to small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use.
format Text
id pubmed-1838430
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1838430 2007-04-04 Statistical tests to compare motif count exceptionalities Robin, Stéphane Schbath, Sophie Vandewalle, Vincent BMC Bioinformatics Methodology Article BioMed Central 2007-03-08 /pmc/articles/PMC1838430/ /pubmed/17346349 http://dx.doi.org/10.1186/1471-2105-8-84 Text en Copyright © 2007 Robin et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Robin, Stéphane
Schbath, Sophie
Vandewalle, Vincent
Statistical tests to compare motif count exceptionalities
title Statistical tests to compare motif count exceptionalities
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1838430/
https://www.ncbi.nlm.nih.gov/pubmed/17346349
http://dx.doi.org/10.1186/1471-2105-8-84
work_keys_str_mv AT robinstephane statisticalteststocomparemotifcountexceptionalities
AT schbathsophie statisticalteststocomparemotifcountexceptionalities
AT vandewallevincent statisticalteststocomparemotifcountexceptionalities
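
Illustrative note: the abstract in this record describes two tests for comparing the exceptionality of one motif in two sequences, an exact binomial test and an asymptotic likelihood ratio test, with motif counts modeled as Poisson. The sketch below shows the textbook version of that idea and is not taken from the article itself. It assumes plain (non-compound) Poisson counts, so it ignores the article's treatment of overlapping motifs. The function names and the parameters `e1` and `e2` (expected counts of the motif in each sequence under some background model) are hypothetical. Under the null hypothesis that the observed/expected ratio is the same in both sequences, the first count given the total follows a binomial distribution with success probability `e1 / (e1 + e2)`.

```python
import math


def binomial_test_upper(n1, n2, e1, e2):
    """One-sided exact p-value P(X >= n1), X ~ Binomial(n1 + n2, e1 / (e1 + e2)).

    n1, n2: observed motif counts in sequences 1 and 2.
    e1, e2: expected counts under a background model (hypothetical inputs).
    Intended for small counts; very large totals can overflow float arithmetic.
    """
    n = n1 + n2
    p = e1 / (e1 + e2)
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n1, n + 1)
    )


def likelihood_ratio_test(n1, n2, e1, e2):
    """Two-sided asymptotic test of the same null hypothesis.

    Returns (statistic, p_value); the statistic is compared to a chi-square
    distribution with 1 degree of freedom.
    """
    n = n1 + n2
    p = e1 / (e1 + e2)

    def term(observed, fitted):
        # Convention: 0 * log(0) = 0.
        return observed * math.log(observed / fitted) if observed > 0 else 0.0

    stat = 2.0 * (term(n1, n * p) + term(n2, n * (1 - p)))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up counts purely for illustration.
    print(binomial_test_upper(12, 3, 5.0, 4.0))
    print(likelihood_ratio_test(12, 3, 5.0, 4.0))
```

The binomial version is exact but sums over up to `n1 + n2 + 1` terms, which matches the abstract's remark that it suits small counts; the likelihood ratio version is a closed-form approximation that becomes reasonable as counts grow.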