Profiled support vector machines for antisense oligonucleotide efficacy prediction

BACKGROUND: This paper presents the use of Support Vector Machines (SVMs) for prediction and analysis of antisense oligonucleotide (AO) efficacy. The collected database comprises 315 AO molecules including 68 features each, inducing a problem well-suited to SVMs. The task of feature selection is cru...

Bibliographic Details
Main authors: Camps-Valls, Gustavo; Chalk, Alistair M; Serrano-López, Antonio J; Martín-Guerrero, José D; Sonnhammer, Erik LL
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC526382/
https://www.ncbi.nlm.nih.gov/pubmed/15383156
http://dx.doi.org/10.1186/1471-2105-5-135
author Camps-Valls, Gustavo
Chalk, Alistair M
Serrano-López, Antonio J
Martín-Guerrero, José D
Sonnhammer, Erik LL
collection PubMed
description BACKGROUND: This paper presents the use of Support Vector Machines (SVMs) for prediction and analysis of antisense oligonucleotide (AO) efficacy. The collected database comprises 315 AO molecules, each described by 68 features, a problem well suited to SVMs. Feature selection is crucial given the presence of noisy or redundant features and the well-known curse of dimensionality. We propose a two-stage strategy to develop an optimal model: (1) feature selection using correlation analysis, mutual information, and SVM-based recursive feature elimination (SVM-RFE), and (2) AO prediction using standard and profiled SVM formulations. A profiled SVM gives different weights to different parts of the training data to focus the training on the most important regions. RESULTS: In the first stage, the SVM-RFE technique was the most efficient and robust in the presence of a low number of samples and a high input space dimension. This method yielded an optimal subset of 14 representative features, which were all related to energy and sequence motifs. The second stage evaluated the performance of the predictors (overall correlation coefficient between observed and predicted efficacy, r; mean error, ME; and root-mean-square error, RMSE) using 8-fold and minus-one-RNA cross-validation methods. The profiled SVM produced the best results (r = 0.44, ME = 0.022, and RMSE = 0.278) and predicted high-efficacy (>75% inhibition of gene expression) and low-efficacy (<25%) AOs with success rates of 83.3% and 82.9%, respectively, which is better than previous approaches. A web server for AO prediction is available online. CONCLUSIONS: The SVM approach is well suited to the AO prediction problem and yields a prediction accuracy superior to previous methods. The profiled SVM was found to perform better than the standard SVM, suggesting that it could lead to improvements in other prediction problems as well.
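The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the sign convention for ME (predicted minus observed) is an assumption, and "minus-one-RNA" cross-validation is read here as holding out all AOs that target one RNA at a time, which the record does not spell out.

```python
import math
from collections import defaultdict


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mean_error(observed, predicted):
    """Mean signed error (assumed convention: predicted minus observed)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group.

    `groups` holds one label per sample, e.g. the target RNA of each AO
    (an assumed reading of "minus-one-RNA" cross-validation).
    """
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    for label, test_indices in by_group.items():
        held_out = set(test_indices)
        train_indices = [i for i in range(len(groups)) if i not in held_out]
        yield label, train_indices, test_indices
```

With per-AO efficacies scaled to the range 0 to 1, `pearson_r`, `mean_error`, and `rmse` would be computed on the pooled held-out predictions from all folds; whether the paper pools folds or averages per-fold scores is not stated in this record.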
format Text
id pubmed-526382
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-526382 2004-11-10 Profiled support vector machines for antisense oligonucleotide efficacy prediction Camps-Valls, Gustavo Chalk, Alistair M Serrano-López, Antonio J Martín-Guerrero, José D Sonnhammer, Erik LL BMC Bioinformatics Research Article BioMed Central 2004-09-22 /pmc/articles/PMC526382/ /pubmed/15383156 http://dx.doi.org/10.1186/1471-2105-5-135 Text en Copyright © 2004 Camps-Valls et al; licensee BioMed Central Ltd.
title Profiled support vector machines for antisense oligonucleotide efficacy prediction
topic Research Article