Cargando…
Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We d...
Autores principales: | , , |
---|---|
Formato: | Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2010
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2900294/ https://www.ncbi.nlm.nih.gov/pubmed/20628623 http://dx.doi.org/10.1371/journal.pcbi.1000834 |
_version_ | 1782183619909910528 |
---|---|
author | Reynolds, Sheila M. Bilmes, Jeff A. Noble, William Stafford |
author_facet | Reynolds, Sheila M. Bilmes, Jeff A. Noble, William Stafford |
author_sort | Reynolds, Sheila M. |
collection | PubMed |
description | DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence—301 base pairs, centered at the position to be scored—with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model. |
format | Text |
id | pubmed-2900294 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-29002942010-07-13 Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens Reynolds, Sheila M. Bilmes, Jeff A. Noble, William Stafford PLoS Comput Biol Research Article DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence—301 base pairs, centered at the position to be scored—with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model. Public Library of Science 2010-07-08 /pmc/articles/PMC2900294/ /pubmed/20628623 http://dx.doi.org/10.1371/journal.pcbi.1000834 Text en Reynolds et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Reynolds, Sheila M. Bilmes, Jeff A. Noble, William Stafford Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens |
title | Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
|
title_full | Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
|
title_fullStr | Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
|
title_full_unstemmed | Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
|
title_short | Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens
|
title_sort | learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in saccharomyces cerevisiae and homo sapiens |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2900294/ https://www.ncbi.nlm.nih.gov/pubmed/20628623 http://dx.doi.org/10.1371/journal.pcbi.1000834 |
work_keys_str_mv | AT reynoldssheilam learningaweightedsequencemodelofthenucleosomecoreandlinkeryieldsmoreaccuratepredictionsinsaccharomycescerevisiaeandhomosapiens AT bilmesjeffa learningaweightedsequencemodelofthenucleosomecoreandlinkeryieldsmoreaccuratepredictionsinsaccharomycescerevisiaeandhomosapiens AT noblewilliamstafford learningaweightedsequencemodelofthenucleosomecoreandlinkeryieldsmoreaccuratepredictionsinsaccharomycescerevisiaeandhomosapiens |