A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction
BACKGROUND: Predicting type-1 Human Immunodeficiency Virus (HIV-1) protease cleavage site in protein molecules and determining its specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug...
Main Authors: | Öztürk, Orkun; Aksaç, Alper; Elsheikh, Abdallah; Özyer, Tansel; Alhajj, Reda |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3751940/ https://www.ncbi.nlm.nih.gov/pubmed/24058397 http://dx.doi.org/10.1371/journal.pone.0063145 |
_version_ | 1782281707279351808 |
---|---|
author | Öztürk, Orkun; Aksaç, Alper; Elsheikh, Abdallah; Özyer, Tansel; Alhajj, Reda |
author_facet | Öztürk, Orkun; Aksaç, Alper; Elsheikh, Abdallah; Özyer, Tansel; Alhajj, Reda |
author_sort | Öztürk, Orkun |
collection | PubMed |
description | BACKGROUND: Predicting type-1 Human Immunodeficiency Virus (HIV-1) protease cleavage site in protein molecules and determining its specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors) against this life-threatening virus. However, some drawbacks (like the shortage of available training data and the high dimensionality of the feature space) turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods, have been proposed in order to increase the accuracy of the classification model. In addition, for several classification problems characterized by having few samples and many features, selecting the most relevant features is a major factor in increasing classification accuracy. RESULTS: We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs). We used various classifiers for evaluating the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied on HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations. CONCLUSION: Applying feature selection on training data before performing the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested and noteworthy outcomes have been reported. These facts motivate the work presented in this paper. SOFTWARE AVAILABILITY: The software is available at http://ozyer.etu.edu.tr/c-fs-svm.rar. The software can also be downloaded at esnag.etu.edu.tr/software/hiv_cleavage_site_prediction.rar; the archive includes a readme file which explains how to set up the software. |
format | Online Article Text |
id | pubmed-3751940 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3751940 2013-09-20 A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction Öztürk, Orkun Aksaç, Alper Elsheikh, Abdallah Özyer, Tansel Alhajj, Reda PLoS One Research Article Public Library of Science 2013-08-23 /pmc/articles/PMC3751940/ /pubmed/24058397 http://dx.doi.org/10.1371/journal.pone.0063145 Text en © 2013 Öztürk et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Öztürk, Orkun Aksaç, Alper Elsheikh, Abdallah Özyer, Tansel Alhajj, Reda A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title | A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title_full | A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title_fullStr | A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title_full_unstemmed | A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title_short | A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction |
title_sort | consistency-based feature selection method allied with linear svms for hiv-1 protease cleavage site prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3751940/ https://www.ncbi.nlm.nih.gov/pubmed/24058397 http://dx.doi.org/10.1371/journal.pone.0063145 |
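
The description above mentions a consistency-based feature selection approach. As a rough illustration only, the sketch below uses the commonly cited inconsistency-rate criterion: samples that share the same values on the selected features but carry different labels count as inconsistent. This is not the authors' software and it omits the SVM recursive feature elimination step; the function names, the greedy backward search, the threshold parameter, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of samples that disagree with the majority label of their pattern.

    A pattern is the tuple of values a sample takes on the features in `subset`.
    """
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    # For each pattern, every sample outside the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def backward_eliminate(rows, labels, threshold=0.0):
    """Greedily drop features while the inconsistency rate stays within threshold.

    Hypothetical helper: a simple one-pass backward search, chosen for brevity.
    """
    subset = list(range(len(rows[0])))
    for feature in list(subset):
        trial = [f for f in subset if f != feature]
        if trial and inconsistency_rate(rows, labels, trial) <= threshold:
            subset = trial
    return subset


# Toy data (made up): feature 0 determines the label, feature 1 is noise,
# feature 2 is constant.
rows = [("A", "x", "p"), ("A", "y", "p"), ("G", "x", "p"), ("G", "y", "p")]
labels = [1, 1, 0, 0]
selected = backward_eliminate(rows, labels)
print(selected)
```

On this toy input the search keeps only the feature that separates the two classes. A subset found this way could then be passed to any classifier, which is the kind of evaluation the description refers to.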