Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder

We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended...

Bibliographic Details
Main Authors: Sharov, Alexei A., Ko, Minoru S.H.
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762409/
https://www.ncbi.nlm.nih.gov/pubmed/19740934
http://dx.doi.org/10.1093/dnares/dsp014
_version_ 1782172918419030016
author Sharov, Alexei A.
Ko, Minoru S.H.
author_facet Sharov, Alexei A.
Ko, Minoru S.H.
author_sort Sharov, Alexei A.
collection PubMed
description We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences.
format Text
id pubmed-2762409
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-27624092009-10-15 Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder Sharov, Alexei A. Ko, Minoru S.H. DNA Res Full Papers We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences. Oxford University Press 2009-10 2009-09-09 /pmc/articles/PMC2762409/ /pubmed/19740934 http://dx.doi.org/10.1093/dnares/dsp014 Text en © The Author 2009. Published by Oxford University Press on behalf of Kazusa DNA Research Institute http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Full Papers
Sharov, Alexei A.
Ko, Minoru S.H.
Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title_full Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title_fullStr Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title_full_unstemmed Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title_short Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
title_sort exhaustive search for over-represented dna sequence motifs with cisfinder
topic Full Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2762409/
https://www.ncbi.nlm.nih.gov/pubmed/19740934
http://dx.doi.org/10.1093/dnares/dsp014
work_keys_str_mv AT sharovalexeia exhaustivesearchforoverrepresenteddnasequencemotifswithcisfinder
AT kominorush exhaustivesearchforoverrepresenteddnasequencemotifswithcisfinder