
Fast splice site detection using information content and feature reduction

BACKGROUND: Accurate identification of splice sites in DNA sequences plays a key role in the prediction of gene structure in eukaryotes. Many computational methods have already been proposed for the detection of splice sites, and some of them have shown high prediction accuracy. However, most of these me...


Bibliographic Details
Main authors: Baten, AKMA, Halgamuge, SK, Chang, BCH
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2638148/
https://www.ncbi.nlm.nih.gov/pubmed/19091031
http://dx.doi.org/10.1186/1471-2105-9-S12-S8
author Baten, AKMA
Halgamuge, SK
Chang, BCH
collection PubMed
description BACKGROUND: Accurate identification of splice sites in DNA sequences plays a key role in the prediction of gene structure in eukaryotes. Many computational methods have already been proposed for the detection of splice sites, and some of them have shown high prediction accuracy. However, most of these methods are limited by their long computation time when applied to whole-genome sequence data. RESULTS: In this paper we propose a hybrid algorithm that combines several effective and informative input features with a state-of-the-art support vector machine (SVM). To obtain the input features we employ an information content method based on Shannon's information theory, Shapiro's score scheme, and Markovian probabilities. We also use a feature elimination scheme to remove the less informative features from the input data. CONCLUSION: In this study we propose a new feature-based splice site detection method that shows improved acceptor and donor splice site detection in DNA sequences when its performance is compared with various state-of-the-art and well-known methods.
format Text
id pubmed-2638148
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2638148 2009-02-24 Fast splice site detection using information content and feature reduction Baten, AKMA Halgamuge, SK Chang, BCH BMC Bioinformatics Research BioMed Central 2008-12-12 /pmc/articles/PMC2638148/ /pubmed/19091031 http://dx.doi.org/10.1186/1471-2105-9-S12-S8 Text en Copyright © 2008 Baten et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fast splice site detection using information content and feature reduction
topic Research