Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae
Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots, which were defined in an ORF-based manner, here we first defined re...
Main Authors: | Liu, Guoqing; Song, Shuangjian; Zhang, Qiguo; Dong, Biyu; Sun, Yu; Liu, Guojun; Zhao, Xiujuan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8276760/ https://www.ncbi.nlm.nih.gov/pubmed/34267784 http://dx.doi.org/10.3389/fgene.2021.705038 |
_version_ | 1783721962004021248 |
---|---|
author | Liu, Guoqing Song, Shuangjian Zhang, Qiguo Dong, Biyu Sun, Yu Liu, Guojun Zhao, Xiujuan |
author_facet | Liu, Guoqing Song, Shuangjian Zhang, Qiguo Dong, Biyu Sun, Yu Liu, Guojun Zhao, Xiujuan |
author_sort | Liu, Guoqing |
collection | PubMed |
description | Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots, which were defined in an ORF-based manner, here we first defined recombination hot/cold spots based on public high-resolution Spo11-oligo-seq data, then characterized them in terms of DNA sequence and epigenetic marks, and finally presented classifiers to identify hotspots. We found that, in addition to some previously discovered DNA-based features such as GC-skew, recombination hotspots in yeast can also be characterized by some remarkable features associated with DNA physical properties and shape. More importantly, using DNA-based features and several epigenetic marks, we built several classifiers to discriminate hotspots from coldspots and found that the SVM classifier performs best, with an accuracy of ∼92%, which is also the highest among the models compared. Feature importance analysis combined with prediction results shows that epigenetic marks and the variation of sequence-based features along the hotspots contribute dominantly to hotspot identification. Using the incremental feature selection method, an optimal feature subset consisting of far fewer features was obtained without sacrificing prediction accuracy. |
format | Online Article Text |
id | pubmed-8276760 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-82767602021-07-14 Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae Liu, Guoqing Song, Shuangjian Zhang, Qiguo Dong, Biyu Sun, Yu Liu, Guojun Zhao, Xiujuan Front Genet Genetics Characterization and identification of recombination hotspots provide important insights into the mechanism of recombination and genome evolution. In contrast with existing sequence-based models for predicting recombination hotspots which were defined in a ORF-based manner, here, we first defined recombination hot/cold spots based on public high-resolution Spo11-oligo-seq data, then characterized them in terms of DNA sequence and epigenetic marks, and finally presented classifiers to identify hotspots. We found that, in addition to some previously discovered DNA-based features like GC-skew, recombination hotspots in yeast can also be characterized by some remarkable features associated with DNA physical properties and shape. More importantly, by using DNA-based features and several epigenetic marks, we built several classifiers to discriminate hotspots from coldspots, and found that SVM classifier performs the best with an accuracy of ∼92%, which is also the highest among the models in comparison. Feature importance analysis combined with prediction results show that epigenetic marks and variation of sequence-based features along the hotspots contribute dominantly to hotspot identification. By using incremental feature selection method, an optimal feature subset that consists of much less features was obtained without sacrificing prediction accuracy. Frontiers Media S.A. 2021-06-29 /pmc/articles/PMC8276760/ /pubmed/34267784 http://dx.doi.org/10.3389/fgene.2021.705038 Text en Copyright © 2021 Liu, Song, Zhang, Dong, Sun, Liu and Zhao. 
https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Epigenetic Marks and Variation of Sequence-Based Information Along Genomic Regions Are Predictive of Recombination Hot/Cold Spots in Saccharomyces cerevisiae |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8276760/ https://www.ncbi.nlm.nih.gov/pubmed/34267784 http://dx.doi.org/10.3389/fgene.2021.705038 |
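
The description above names GC-skew as a DNA-based feature and highlights the variation of sequence-based features along a region. As a rough illustration only, the sketch below computes GC-skew with the standard definition (G − C) / (G + C) over consecutive windows and summarizes its variation as a population standard deviation. The function names, the non-overlapping windowing scheme, the default window size, and the choice of standard deviation as the variation measure are all assumptions made for this example; the record does not say how the authors computed their features.

```python
from statistics import pstdev


def gc_skew(seq):
    """Return (G - C) / (G + C) for a DNA string; 0.0 if it has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window=50):
    """GC-skew of consecutive non-overlapping windows.

    The window size is a hypothetical choice; a trailing partial
    window is dropped.
    """
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def skew_variation(seq, window=50):
    """Population standard deviation of windowed GC-skew (assumed measure)."""
    values = windowed_gc_skew(seq, window)
    return pstdev(values) if values else 0.0
```

For instance, `windowed_gc_skew("GGGGCCCC", window=4)` yields one G-only window followed by one C-only window, so the skew flips sign along the sequence and `skew_variation` reports a nonzero spread.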