Bayesian Markov models improve the prediction of binding motifs beyond first order

Transcription factors (TFs) regulate gene expression by binding to specific DNA motifs. Accurate models for predicting binding affinities are crucial for a quantitative understanding of transcriptional regulation. Motifs are commonly described by position weight matrices, which assume that each posi...

Bibliographic Details
Main authors:	Ge, Wanwan; Meier, Markus; Roth, Christian; Söding, Johannes
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Methods and Benchmark Surveys
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057495/ https://www.ncbi.nlm.nih.gov/pubmed/33928244 http://dx.doi.org/10.1093/nargab/lqab026

author	Ge, Wanwan; Meier, Markus; Roth, Christian; Söding, Johannes
author_facet	Ge, Wanwan; Meier, Markus; Roth, Christian; Söding, Johannes
author_sort	Ge, Wanwan
collection	PubMed
description	Transcription factors (TFs) regulate gene expression by binding to specific DNA motifs. Accurate models for predicting binding affinities are crucial for a quantitative understanding of transcriptional regulation. Motifs are commonly described by position weight matrices, which assume that each position contributes independently to the binding energy. Models that can learn dependencies between positions, for instance those induced by DNA structure preferences, have yielded markedly improved predictions for most TFs on in vivo data. However, they are more prone to overfit the data and to learn patterns merely correlated with, rather than directly involved in, TF binding. We present an improved, faster version of our Bayesian Markov model software, BaMMmotif2. We tested it against state-of-the-art motif discovery tools on a large collection of ChIP-seq and HT-SELEX datasets. Fifth-order BaMMmotif2 models achieved a median false-discovery-rate-averaged recall 13.6% and 12.2% higher than the next-best tool on 427 ChIP-seq datasets and 164 HT-SELEX datasets, respectively, while being 8 to 1000 times faster. BaMMmotif2 models showed no signs of overtraining in cross-cell-line and cross-platform tests, with similar improvements over the next-best tool. These results demonstrate that dependencies beyond first order clearly improve binding models for most TFs.
format	Online Article Text
id	pubmed-8057495
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-8057495 2021-04-28 Bayesian Markov models improve the prediction of binding motifs beyond first order Ge, Wanwan; Meier, Markus; Roth, Christian; Söding, Johannes NAR Genom Bioinform Methods and Benchmark Surveys Oxford University Press 2021-04-20 /pmc/articles/PMC8057495/ /pubmed/33928244 http://dx.doi.org/10.1093/nargab/lqab026 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle	Methods and Benchmark Surveys Ge, Wanwan Meier, Markus Roth, Christian Söding, Johannes Bayesian Markov models improve the prediction of binding motifs beyond first order
title	Bayesian Markov models improve the prediction of binding motifs beyond first order
topic	Methods and Benchmark Surveys
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8057495/ https://www.ncbi.nlm.nih.gov/pubmed/33928244 http://dx.doi.org/10.1093/nargab/lqab026
