
DNA motif elucidation using belief propagation

Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure the binding signal intensities of a protein to all possible DNA k-mers (k = 8–10); such comprehensive...


Bibliographic Details
Main authors: Wong, Ka-Chun, Chan, Tak-Ming, Peng, Chengbin, Li, Yue, Zhang, Zhaolei
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3763557/
https://www.ncbi.nlm.nih.gov/pubmed/23814189
http://dx.doi.org/10.1093/nar/gkt574
_version_ 1782283034929659904
author Wong, Ka-Chun
Chan, Tak-Ming
Peng, Chengbin
Li, Yue
Zhang, Zhaolei
author_facet Wong, Ka-Chun
Chan, Tak-Ming
Peng, Chengbin
Li, Yue
Zhang, Zhaolei
author_sort Wong, Ka-Chun
collection PubMed
description Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure the binding signal intensities of a protein to all possible DNA k-mers (k = 8–10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagation. The approach (kmerHMM) accepts PBM probe raw data and preprocesses them into median binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagation. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness; in particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source code are available at the authors’ websites, e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.
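The preprocessing step named in the description (reducing probe-level signals to a median binding intensity per k-mer, then ranking the k-mers) can be illustrated with a minimal sketch. This is not the kmerHMM code: the function names, the toy probes, and the choice to count a k-mer once per probe are assumptions made for illustration, and the real tool may additionally handle reverse complements, normalization, or probe filtering.

```python
from collections import defaultdict
from statistics import median


def median_kmer_intensities(probes, k=8):
    """Map each k-mer to the median intensity of the probes containing it.

    `probes` is an iterable of (sequence, intensity) pairs (hypothetical input layout).
    """
    by_kmer = defaultdict(list)
    for seq, intensity in probes:
        # Assumption: a k-mer counts once per probe, even if it occurs twice in it.
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            by_kmer[kmer].append(intensity)
    return {kmer: median(vals) for kmer, vals in by_kmer.items()}


def rank_kmers(medians):
    """Sort k-mers from highest to lowest median intensity; ties go alphabetically."""
    return sorted(medians.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, made-up probes: (sequence, signal intensity).
probes = [("ACGTAC", 10.0), ("CGTACG", 30.0), ("TTACGT", 20.0)]
ranked = rank_kmers(median_kmer_intensities(probes, k=4))
for kmer, score in ranked:
    print(kmer, score)
```

The median is used here because the description names it; it is also less sensitive than the mean to a single outlying probe. The top of the ranked list would be the natural input to the alignment and HMM training steps, which this sketch does not attempt.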
format Online
Article
Text
id pubmed-3763557
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3763557 2013-09-10 DNA motif elucidation using belief propagation Wong, Ka-Chun Chan, Tak-Ming Peng, Chengbin Li, Yue Zhang, Zhaolei Nucleic Acids Res Methods Online Oxford University Press 2013-09 2013-06-29 /pmc/articles/PMC3763557/ /pubmed/23814189 http://dx.doi.org/10.1093/nar/gkt574 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Online
Wong, Ka-Chun
Chan, Tak-Ming
Peng, Chengbin
Li, Yue
Zhang, Zhaolei
DNA motif elucidation using belief propagation
title DNA motif elucidation using belief propagation
title_full DNA motif elucidation using belief propagation
title_fullStr DNA motif elucidation using belief propagation
title_full_unstemmed DNA motif elucidation using belief propagation
title_short DNA motif elucidation using belief propagation
title_sort dna motif elucidation using belief propagation
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3763557/
https://www.ncbi.nlm.nih.gov/pubmed/23814189
http://dx.doi.org/10.1093/nar/gkt574
work_keys_str_mv AT wongkachun dnamotifelucidationusingbeliefpropagation
AT chantakming dnamotifelucidationusingbeliefpropagation
AT pengchengbin dnamotifelucidationusingbeliefpropagation
AT liyue dnamotifelucidationusingbeliefpropagation
AT zhangzhaolei dnamotifelucidationusingbeliefpropagation