Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding

Uncovering the mechanisms that affect the binding specificity of transcription factors (TFs) is critical for understanding the principles of gene regulation. Although sequence-based models have been used successfully to predict TF binding specificities, we found that including DNA shape information...

Bibliographic Details
Main authors: Li, Jinsen; Sagendorf, Jared M.; Chiu, Tsu-Pei; Pasi, Marco; Perez, Alberto; Rohs, Remo
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Genomics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5728407/
https://www.ncbi.nlm.nih.gov/pubmed/29165643
http://dx.doi.org/10.1093/nar/gkx1145
_version_ 1783286025023389696
author Li, Jinsen
Sagendorf, Jared M.
Chiu, Tsu-Pei
Pasi, Marco
Perez, Alberto
Rohs, Remo
collection PubMed
description Uncovering the mechanisms that affect the binding specificity of transcription factors (TFs) is critical for understanding the principles of gene regulation. Although sequence-based models have been used successfully to predict TF binding specificities, we found that including DNA shape information in these models improved their accuracy and interpretability. Previously, we developed a method for modeling DNA binding specificities based on DNA shape features extracted from Monte Carlo (MC) simulations. Prediction accuracies of our models, however, have not yet been compared to accuracies of models incorporating DNA shape information extracted from X-ray crystallography (XRC) data or Molecular Dynamics (MD) simulations. Here, we integrated DNA shape information extracted from MC or MD simulations and XRC data into predictive models of TF binding and compared their performance. Models that incorporated structural information consistently showed improved performance over sequence-based models regardless of data source. Furthermore, we derived and validated nine additional DNA shape features beyond our original set of four features. The expanded repertoire of 13 distinct DNA shape features, including six intra-base pair and six inter-base pair parameters and minor groove width, is available in our R/Bioconductor package DNAshapeR and enables a comprehensive structural description of the double helix on a genome-wide scale.
format Online
Article
Text
id pubmed-5728407
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5728407 2017-12-18 Nucleic Acids Res Genomics Oxford University Press 2017-12-15 2017-11-20 /pmc/articles/PMC5728407/ /pubmed/29165643 http://dx.doi.org/10.1093/nar/gkx1145 Text en
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding
topic Genomics