
Support vector machine for classification of meiotic recombination hotspots and coldspots in Saccharomyces cerevisiae based on codon composition

BACKGROUND: Meiotic double-strand breaks occur at relatively high frequencies in some genomic regions (hotspots) and relatively low frequencies in others (coldspots). Hotspots and coldspots are receiving increasing attention in research into the mechanism of meiotic recombination. However, predictin...


Bibliographic Details
Main Authors: Zhou, Tong; Weng, Jianhong; Sun, Xiao; Lu, Zuhong
Format: Text
Language: English
Published: BioMed Central 2006
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1463011/
https://www.ncbi.nlm.nih.gov/pubmed/16640774
http://dx.doi.org/10.1186/1471-2105-7-223
author Zhou, Tong
Weng, Jianhong
Sun, Xiao
Lu, Zuhong
collection PubMed
description BACKGROUND: Meiotic double-strand breaks occur at relatively high frequencies in some genomic regions (hotspots) and relatively low frequencies in others (coldspots). Hotspots and coldspots are receiving increasing attention in research into the mechanism of meiotic recombination. However, predicting hotspots and coldspots from DNA sequence information is still a challenging task. RESULTS: We present a novel method for classification of hot and cold ORFs located in hotspots and coldspots respectively in Saccharomyces cerevisiae, using a support vector machine (SVM), which relies on codon composition differences. This method has achieved a high classification accuracy of 85.0%. Since codon composition is a fusion of codon usage bias and amino acid composition signals, the ability of these two kinds of sequence attributes to discriminate hot ORFs from cold ORFs was also investigated separately. Our results indicate that neither codon usage bias nor amino acid composition taken separately performed as well as codon composition. Moreover, our SVM-based method was applied to the full genome: we predicted the hot/cold ORFs from the yeast genome by using cutoffs of recombination rate. We found that the performance of our method for predicting cold ORFs is not as good as that for predicting hot ORFs. We also observed a considerable correlation between meiotic recombination rate and amino acid composition of certain residues, which probably reflects the structural and functional dissimilarity between the hot and cold groups. CONCLUSION: We have introduced a novel SVM-based method to discriminate hot ORFs from cold ones. Applying codon composition as sequence attributes, we have achieved a high classification accuracy, which suggests that codon composition has strong potential to be used as sequence attributes in the prediction of hot and cold ORFs.
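The description names codon composition as the sequence attribute but does not say how it was computed. As a minimal sketch, assuming codon composition means the relative frequency of each of the 64 codons read in frame, an ORF can be turned into a fixed-length feature vector as shown here. The function name `codon_composition` and the handling of ambiguous bases are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every ORF maps to a vector with the same layout.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_composition(orf: str) -> list[float]:
    """Return the relative frequency of each of the 64 codons in an ORF.

    Assumption: the sequence starts in frame. A trailing partial codon and
    codons containing ambiguous bases (e.g. N) are skipped.
    """
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


# Hypothetical toy ORF: ATG GCT GCT TAA
vector = codon_composition("ATGGCTGCTTAA")
```

Vectors of this kind, one per ORF with a hot or cold label, are the sort of input a standard SVM implementation could be trained on. The kernel and parameters used in the paper are not given in this record.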
format Text
id pubmed-1463011
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1463011 2006-06-07 BMC Bioinformatics Methodology Article BioMed Central 2006-04-26 /pmc/articles/PMC1463011/ /pubmed/16640774 http://dx.doi.org/10.1186/1471-2105-7-223 Text en Copyright © 2006 Zhou et al; licensee BioMed Central Ltd.
title Support vector machine for classification of meiotic recombination hotspots and coldspots in Saccharomyces cerevisiae based on codon composition
topic Methodology Article