
Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements

BACKGROUND: Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. RESULTS: We describe an improvement to the A-GL...


Bibliographic Details
Main authors: Tharakaraman, Kannan, Mariño-Ramírez, Leonardo, Sheetlin, Sergey L, Landsman, David, Spouge, John L
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1599759/
https://www.ncbi.nlm.nih.gov/pubmed/16961919
http://dx.doi.org/10.1186/1471-2105-7-408
_version_ 1782130445091078144
author Tharakaraman, Kannan
Mariño-Ramírez, Leonardo
Sheetlin, Sergey L
Landsman, David
Spouge, John L
author_sort Tharakaraman, Kannan
collection PubMed
description BACKGROUND: Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. RESULTS: We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. CONCLUSION: Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances.
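The three scanning steps in the description above (score every window, convert each score to an E-value, let windows below a threshold update the PSSM, repeat until nothing new is added) can be sketched roughly as follows. This is a minimal illustration, not A-GLAM's implementation: the function names, the pseudocount of 1 (a simple Dirichlet-style stand-in for the "Bayesian calculus"), the uniform background of 0.25 per base, and the E-value definition (the exact fraction of all 4^w words scoring at least as high, multiplied by the number of windows scanned) are all assumptions made here, since the abstract does not specify them. Enumerating all words is only practical for short motifs.

```python
import bisect
import itertools
import math

ALPHABET = "ACGT"

def build_pssm(sites, pseudocount=1.0):
    """Log-odds PSSM from aligned sites (assumed uniform background, 0.25 per base)."""
    pssm = []
    for col in range(len(sites[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / 0.25) for b in ALPHABET})
    return pssm

def score(word, pssm):
    """Individual score of one window: sum of per-column log-odds."""
    return sum(column[base] for column, base in zip(pssm, word))

def scan_to_convergence(sequences, seed_positions, width,
                        e_threshold=0.01, max_iter=50):
    """Iterate scan -> E-value -> PSSM update until no new window is accepted.

    seed_positions: (sequence index, offset) pairs, standing in for the
    one-site-per-sequence alignment that Gibbs sampling would provide.
    Returns accepted windows as (E-value, sequence index, offset, word),
    sorted by increasing E-value.
    """
    windows = [(i, j, seq[j:j + width])
               for i, seq in enumerate(sequences)
               for j in range(len(seq) - width + 1)]
    word_at = {(i, j): w for i, j, w in windows}
    accepted = set(seed_positions)
    hits = []
    for _ in range(max_iter):
        pssm = build_pssm([word_at[pos] for pos in sorted(accepted)])
        # Exact null distribution: scores of all 4^width words (assumption).
        null = sorted(score(w, pssm)
                      for w in itertools.product(ALPHABET, repeat=width))
        hits = []
        for i, j, w in windows:
            # Number of words scoring at least as high as this window.
            rank = len(null) - bisect.bisect_left(null, score(w, pssm) - 1e-9)
            e_value = rank / len(null) * len(windows)
            if e_value <= e_threshold:
                hits.append((e_value, i, j, w))
        new = {(i, j) for e, i, j, w in hits} - accepted
        if not new:  # convergence: no new subsequence contributes
            break
        accepted |= new
    return sorted(hits)
```

Called with a list of DNA strings, one seed position per sequence, and the motif width, the function returns every window whose E-value falls at or below the threshold, including additional instances in sequences that already hold a seed site.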
format Text
id pubmed-1599759
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1599759 2006-10-12 Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements Tharakaraman, Kannan Mariño-Ramírez, Leonardo Sheetlin, Sergey L Landsman, David Spouge, John L BMC Bioinformatics Software BioMed Central 2006-09-08 /pmc/articles/PMC1599759/ /pubmed/16961919 http://dx.doi.org/10.1186/1471-2105-7-408 Text en Copyright © 2006 Tharakaraman et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1599759/
https://www.ncbi.nlm.nih.gov/pubmed/16961919
http://dx.doi.org/10.1186/1471-2105-7-408