NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence
NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of...
Main Authors: | Down, Thomas A.; Hubbard, Tim J. P. |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2005 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1064142/ https://www.ncbi.nlm.nih.gov/pubmed/15760844 http://dx.doi.org/10.1093/nar/gki282 |
_version_ | 1782123307411177472 |
---|---|
author | Down, Thomas A. Hubbard, Tim J. P. |
author_facet | Down, Thomas A. Hubbard, Tim J. P. |
author_sort | Down, Thomas A. |
collection | PubMed |
description | NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs which were good predictors for all five abundant classes of annotated binding sites. |
format | Text |
id | pubmed-1064142 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1064142 2005-03-10 NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence Down, Thomas A. Hubbard, Tim J. P. Nucleic Acids Res Article NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs which were good predictors for all five abundant classes of annotated binding sites. Oxford University Press 2005 2005-03-10 /pmc/articles/PMC1064142/ /pubmed/15760844 http://dx.doi.org/10.1093/nar/gki282 Text en © The Author 2005. Published by Oxford University Press. All rights reserved |
spellingShingle | Article Down, Thomas A. Hubbard, Tim J. P. NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence |
title | NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1064142/ https://www.ncbi.nlm.nih.gov/pubmed/15760844 http://dx.doi.org/10.1093/nar/gki282 |