Automatic detection of exonic splicing enhancers (ESEs) using SVMs

BACKGROUND: Exonic splicing enhancers (ESEs) activate nearby splice sites and promote the inclusion (vs. exclusion) of exons in which they reside, while being a binding site for SR proteins. To study the impact of ESEs on alternative splicing it would be useful to have a possibility to detect them i...

Bibliographic Details
Main Authors: Mersch, Britta; Gepperth, Alexander; Suhai, Sándor; Hotz-Wagenblatt, Agnes
Format: Text
Language: English
Published: BioMed Central 2008
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2567995/
https://www.ncbi.nlm.nih.gov/pubmed/18783607
http://dx.doi.org/10.1186/1471-2105-9-369
author Mersch, Britta
Gepperth, Alexander
Suhai, Sándor
Hotz-Wagenblatt, Agnes
collection PubMed
description BACKGROUND: Exonic splicing enhancers (ESEs) activate nearby splice sites and promote the inclusion (rather than exclusion) of the exons in which they reside; they also serve as binding sites for SR proteins. To study the impact of ESEs on alternative splicing, it would be useful to be able to detect them in exons. Identifying SR protein-binding sites in human DNA sequences by machine learning techniques is a formidable task, since exon sequences are also constrained by their functional role in coding for proteins. RESULTS: Choosing training examples for machine learning approaches is difficult, since the literature describes only a few exact locations of human ESEs that could serve as positive examples. Additionally, it is unclear which sequences are suitable as negative examples. Therefore, we developed a motif-oriented data-extraction method that extracts exon sequences around experimentally or theoretically determined ESE patterns. Positive examples are restricted by heuristics based on known properties of ESEs, e.g. location in the vicinity of a splice site, whereas negative examples are taken in the same way from the middle of long exons. We show that a suitably chosen SVM using optimized sequence kernels (e.g., the combined oligo kernel) can extract meaningful properties from these training examples. Once the classifier is trained, every potential ESE sequence can be passed to the SVM for verification. Using SVMs with the combined oligo kernel yields a high accuracy of about 90 percent and readily interpretable parameters. CONCLUSION: The motif-oriented data-extraction method seems to produce consistent training and test data, leading to good classification rates, and thus allows verification of potential ESE motifs. The best results were obtained using an SVM with the combined oligo kernel, while oligo kernels with oligomers of a certain length could be used to extract relevant features.
format Text
id pubmed-2567995
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2567995 2008-10-16 Automatic detection of exonic splicing enhancers (ESEs) using SVMs. Mersch, Britta; Gepperth, Alexander; Suhai, Sándor; Hotz-Wagenblatt, Agnes. BMC Bioinformatics. Research Article. BioMed Central 2008-09-10 /pmc/articles/PMC2567995/ /pubmed/18783607 http://dx.doi.org/10.1186/1471-2105-9-369 Text en Copyright © 2008 Mersch et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automatic detection of exonic splicing enhancers (ESEs) using SVMs
topic Research Article
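
The description above outlines two steps: motif-oriented extraction of labelled exon windows, and classification with a position-aware sequence kernel. The following is a minimal sketch of both ideas, not the authors' implementation. The function names, the window size, and the distance thresholds (`flank`, `near`, `long_exon`, "middle third") are hypothetical choices made for illustration; the record does not state the values used in the paper. The kernel follows the commonly cited oligo kernel form (Gaussian weighting of the position difference between matching k-mers), and treating the "combined" kernel as a plain sum over several k-mer lengths is an assumption.

```python
import math
import re


def extract_windows(exon, motif, flank=10, near=30, long_exon=200):
    """Cut a window around each motif hit and label it by position in the exon.

    +1: the hit lies within `near` bases of either exon end (near a splice site).
    -1: the exon has at least `long_exon` bases and the hit lies in its middle third.
    Hits matching neither rule are skipped. All thresholds are illustrative.
    """
    examples = []
    n = len(exon)
    # The lookahead pattern also finds overlapping occurrences of the motif.
    for m in re.finditer("(?=" + re.escape(motif) + ")", exon):
        start = m.start()
        end = start + len(motif)
        window = exon[max(0, start - flank):min(n, end + flank)]
        if start < near or n - end < near:
            examples.append((window, +1))
        elif n >= long_exon and n / 3 <= start and end <= 2 * n / 3:
            examples.append((window, -1))
    return examples


def oligo_kernel(s, t, k=3, sigma=1.0):
    """Sum exp(-(p - q)^2 / (4 sigma^2)) over all pairs of identical k-mers,
    where p and q are the k-mer start positions in s and t."""
    positions = {}
    for q in range(len(t) - k + 1):
        positions.setdefault(t[q:q + k], []).append(q)
    total = 0.0
    for p in range(len(s) - k + 1):
        for q in positions.get(s[p:p + k], ()):
            total += math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    return total


def combined_oligo_kernel(s, t, ks=(1, 2, 3), sigma=1.0):
    """Assumed combination: an unweighted sum of oligo kernels over several k."""
    return sum(oligo_kernel(s, t, k, sigma) for k in ks)


if __name__ == "__main__":
    exon = "AC" + "GAAGAA" + "T" * 120 + "GAAGAA" + "T" * 120 + "AC"
    for window, label in extract_windows(exon, "GAAGAA"):
        print(label, window)
    print(combined_oligo_kernel("ACGT", "ACGT"))
```

A Gram matrix built from `combined_oligo_kernel` over all extracted windows could then be handed to any SVM implementation that accepts precomputed kernels. Small `sigma` values make the kernel sensitive to exact k-mer positions, while large values approach a position-independent k-mer count.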