Prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes
BACKGROUND: Aptamer-protein interacting pairs perform a variety of physiological functions and have therapeutic potential in organisms. Rapidly and effectively predicting aptamer-protein interacting pairs is important for designing aptamers that bind to specific proteins of interest, which will give insight into...
Main authors: | Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4888498/ https://www.ncbi.nlm.nih.gov/pubmed/27245069 http://dx.doi.org/10.1186/s12859-016-1087-5 |
_version_ | 1782434861229801472 |
---|---|
author | Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing |
author_facet | Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing |
author_sort | Zhang, Lina |
collection | PubMed |
description | BACKGROUND: Aptamer-protein interacting pairs perform a variety of physiological functions and have therapeutic potential in organisms. Rapidly and effectively predicting aptamer-protein interacting pairs is important for designing aptamers that bind to specific proteins of interest, which will give insight into the mechanisms of aptamer-protein interacting pairs and support the development of aptamer-based therapies. RESULTS: In this study, an ensemble method is presented to predict aptamer-protein interacting pairs with hybrid features. The features for aptamers are extracted from Pseudo K-tuple Nucleotide Composition (PseKNC), while the features for proteins incorporate Discrete Cosine Transformation (DCT), disorder information, and bi-gram Position Specific Scoring Matrix (PSSM). We investigate the predictive capabilities of various feature spaces. In 10-fold cross-validation, the proposed ensemble method obtains its best performance, a Youden’s Index of 0.380, using the hybrid feature space of PseKNC, DCT, bi-gram PSSM, and disorder information. The Relief-Incremental Feature Selection (IFS) method is adopted to obtain the optimal feature set. Based on the optimal feature set, the proposed method achieves a balanced performance with a sensitivity of 0.753 and a specificity of 0.725 on the training dataset, which indicates that this method handles the imbalanced data problem effectively. To assess prediction performance objectively, the proposed method is also evaluated on an independent testing dataset. Encouragingly, the proposed method performs better than the previous study, with a sensitivity of 0.738 and a Youden’s Index of 0.451. CONCLUSIONS: These results suggest that the proposed method can be a potential candidate for aptamer-protein interacting pair prediction, which may contribute to finding novel aptamer-protein interacting pairs and understanding the relationship between aptamers and proteins. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1087-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4888498 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-48884982016-06-08 Prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing BMC Bioinformatics Research Article BACKGROUND: Aptamer-protein interacting pairs play a variety of physiological functions and therapeutic potentials in organisms. Rapidly and effectively predicting aptamer-protein interacting pairs is significant to design aptamers binding to certain interested proteins, which will give insight into understanding mechanisms of aptamer-protein interacting pairs and developing aptamer-based therapies. RESULTS: In this study, an ensemble method is presented to predict aptamer-protein interacting pairs with hybrid features. The features for aptamers are extracted from Pseudo K-tuple Nucleotide Composition (PseKNC) while the features for proteins incorporate Discrete Cosine Transformation (DCT), disorder information, and bi-gram Position Specific Scoring Matrix (PSSM). We investigate predictive capabilities of various feature spaces. The proposed ensemble method obtains the best performance with Youden’s Index of 0.380, using the hybrid feature space of PseKNC, DCT, bi-gram PSSM, and disorder information by 10-fold cross validation. The Relief-Incremental Feature Selection (IFS) method is adopted to obtain the optimal feature set. Based on the optimal feature set, the proposed method achieves a balanced performance with a sensitivity of 0.753 and a specificity of 0.725 on the training dataset, which indicates that this method can solve the imbalanced data problem effectively. To evaluate the prediction performance objectively, an independent testing dataset is used to evaluate the proposed method. Encouragingly, our proposed method performs better than previous study with a sensitivity of 0.738 and a Youden’s Index of 0.451. 
CONCLUSIONS: These results suggest that the proposed method can be a potential candidate for aptamer-protein interacting pair prediction, which may contribute to finding novel aptamer-protein interacting pairs and understanding the relationship between aptamers and proteins. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1087-5) contains supplementary material, which is available to authorized users. BioMed Central 2016-05-31 /pmc/articles/PMC4888498/ /pubmed/27245069 http://dx.doi.org/10.1186/s12859-016-1087-5 Text en © Zhang et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Song, Qing Prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes |
title | Prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes |
title_sort | prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4888498/ https://www.ncbi.nlm.nih.gov/pubmed/27245069 http://dx.doi.org/10.1186/s12859-016-1087-5 |
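
The description above reports performance as sensitivity, specificity, and Youden’s Index. As a minimal sketch, assuming the standard definition J = sensitivity + specificity - 1, the following shows how the three quantities relate. The function names are illustrative and do not come from the article, and the two derived values (training-set J and independent-test specificity) are not stated in the record; they follow only from that definition applied to the reported figures.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's Index under the standard definition J = Sn + Sp - 1."""
    return sensitivity + specificity - 1.0


def specificity_from(sensitivity: float, youden: float) -> float:
    """Rearranged definition: Sp = J - Sn + 1."""
    return youden - sensitivity + 1.0


# Figures reported in the description for the optimal feature set (training data).
train_sensitivity, train_specificity = 0.753, 0.725
train_j = youden_index(train_sensitivity, train_specificity)

# Figures reported for the independent testing dataset.
test_sensitivity, test_j = 0.738, 0.451
test_specificity = specificity_from(test_sensitivity, test_j)

# Derived values, not quoted from the record.
print(f"training Youden's Index (derived): {train_j:.3f}")
print(f"independent-test specificity (derived): {test_specificity:.3f}")
```

Under this definition, a classifier that labels every pair as non-interacting has a sensitivity of 0 and a specificity of 1, so J = 0. This is why Youden’s Index is a common summary for imbalanced datasets such as the one described here.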