Recombination Hotspot/Coldspot Identification Combining Three Different Pseudocomponents via an Ensemble Learning Approach

Recombination presents a nonuniform distribution across the genome. Genomic regions that present relatively higher frequencies of recombination are called hotspots while those with relatively lower frequencies of recombination are recombination coldspots. Therefore, the identification of hotspots/co...

Bibliographic Details
Main authors: Liu, Bingquan, Liu, Yumeng, Huang, Dong
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5015011/
https://www.ncbi.nlm.nih.gov/pubmed/27648451
http://dx.doi.org/10.1155/2016/8527435
_version_ 1782452364490309632
author Liu, Bingquan
Liu, Yumeng
Huang, Dong
collection PubMed
description Recombination presents a nonuniform distribution across the genome. Genomic regions that present relatively higher frequencies of recombination are called hotspots while those with relatively lower frequencies of recombination are recombination coldspots. Therefore, the identification of hotspots/coldspots could provide useful information for the study of the mechanism of recombination. In this study, a new computational predictor called SVM-EL was proposed to identify hotspots/coldspots across the yeast genome. It combined Support Vector Machines (SVMs) and Ensemble Learning (EL) based on three features including basic kmer (Kmer), dinucleotide-based auto-cross covariance (DACC), and pseudo dinucleotide composition (PseDNC). These features are able to incorporate the nucleic acid composition and their order information into the predictor. The proposed SVM-EL achieves an accuracy of 82.89% on a widely used benchmark dataset, which outperforms some related methods.
format Online
Article
Text
id pubmed-5015011
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-5015011 2016-09-19
Recombination Hotspot/Coldspot Identification Combining Three Different Pseudocomponents via an Ensemble Learning Approach
Liu, Bingquan
Liu, Yumeng
Huang, Dong
Biomed Res Int
Research Article
Recombination presents a nonuniform distribution across the genome. Genomic regions that present relatively higher frequencies of recombination are called hotspots while those with relatively lower frequencies of recombination are recombination coldspots. Therefore, the identification of hotspots/coldspots could provide useful information for the study of the mechanism of recombination. In this study, a new computational predictor called SVM-EL was proposed to identify hotspots/coldspots across the yeast genome. It combined Support Vector Machines (SVMs) and Ensemble Learning (EL) based on three features including basic kmer (Kmer), dinucleotide-based auto-cross covariance (DACC), and pseudo dinucleotide composition (PseDNC). These features are able to incorporate the nucleic acid composition and their order information into the predictor. The proposed SVM-EL achieves an accuracy of 82.89% on a widely used benchmark dataset, which outperforms some related methods.
Hindawi Publishing Corporation 2016
2016-08-25
/pmc/articles/PMC5015011/
/pubmed/27648451
http://dx.doi.org/10.1155/2016/8527435
Text
en
Copyright © 2016 Bingquan Liu et al.
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Recombination Hotspot/Coldspot Identification Combining Three Different Pseudocomponents via an Ensemble Learning Approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5015011/
https://www.ncbi.nlm.nih.gov/pubmed/27648451
http://dx.doi.org/10.1155/2016/8527435
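
The description in this record names a basic kmer (Kmer) feature as one of the three inputs to SVM-EL. As a rough illustration of what such a feature might look like, the sketch below computes normalized k-mer frequencies for a DNA sequence. It is an assumption-laden example, not the authors' implementation: the function name `kmer_frequencies`, the default `k=2`, the A/C/G/T alphabet ordering, and the handling of ambiguous bases are all hypothetical choices made here for illustration.

```python
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper for illustration only. The output vector has
    4**k entries, ordered lexicographically over the alphabet A, C, G, T.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

A vector produced this way has a fixed length regardless of sequence length, which is what lets sequences of different sizes be passed to a classifier such as an SVM. The DACC and PseDNC features mentioned in the description additionally encode order information and are not covered by this sketch.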