Bayesian Centroid Estimation for Motif Discovery
Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common mot...
Main author: | Carvalho, Luis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3855595/ https://www.ncbi.nlm.nih.gov/pubmed/24324603 http://dx.doi.org/10.1371/journal.pone.0080511 |
_version_ | 1782294942991777792 |
---|---|
author | Carvalho, Luis |
author_facet | Carvalho, Luis |
author_sort | Carvalho, Luis |
collection | PubMed |
description | Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators. |
format | Online Article Text |
id | pubmed-3855595 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3855595 2013-12-09 Bayesian Centroid Estimation for Motif Discovery Carvalho, Luis PLoS One Research Article Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators. Public Library of Science 2013-12-06 /pmc/articles/PMC3855595/ /pubmed/24324603 http://dx.doi.org/10.1371/journal.pone.0080511 Text en © 2013 Luis Carvalho http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Carvalho, Luis Bayesian Centroid Estimation for Motif Discovery |
title | Bayesian Centroid Estimation for Motif Discovery |
title_full | Bayesian Centroid Estimation for Motif Discovery |
title_fullStr | Bayesian Centroid Estimation for Motif Discovery |
title_full_unstemmed | Bayesian Centroid Estimation for Motif Discovery |
title_short | Bayesian Centroid Estimation for Motif Discovery |
title_sort | bayesian centroid estimation for motif discovery |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3855595/ https://www.ncbi.nlm.nih.gov/pubmed/24324603 http://dx.doi.org/10.1371/journal.pone.0080511 |
work_keys_str_mv | AT carvalholuis bayesiancentroidestimationformotifdiscovery |