JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method

Different types of J-proteins perform distinct functions in chaperone processes and disease development. Accurate identification of types of J-proteins will provide significant clues to reveal the mechanism of J-proteins and contribute to developing drugs for diseases. In this study, an ensemble pr...

Bibliographic Details
Main Authors: Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4637456/
https://www.ncbi.nlm.nih.gov/pubmed/26587542
http://dx.doi.org/10.1155/2015/705156
_version_ 1782399818294886400
author Zhang, Lina
Zhang, Chengjin
Gao, Rui
Yang, Runtao
author_facet Zhang, Lina
Zhang, Chengjin
Gao, Rui
Yang, Runtao
author_sort Zhang, Lina
collection PubMed
description Different types of J-proteins perform distinct functions in chaperone processes and disease development. Accurate identification of the types of J-proteins will provide significant clues to reveal the mechanism of J-proteins and contribute to developing drugs for diseases. In this study, an ensemble predictor called JPPRED for J-protein prediction is proposed with hybrid features, including split amino acid composition (SAAC), pseudo amino acid composition (PseAAC), and position specific scoring matrix (PSSM). To deal with the imbalanced benchmark dataset, the synthetic minority oversampling technique (SMOTE) and an undersampling technique are applied. The average sensitivity of JPPRED based on the above-mentioned individual feature spaces lies in the range of 0.744–0.851, indicating the discriminative power of these features. In addition, JPPRED yields the highest average sensitivity of 0.875 using the hybrid feature spaces of SAAC, PseAAC, and PSSM. Compared to individual base classifiers, JPPRED obtains more balanced and better performance for each type of J-protein. To evaluate the prediction performance objectively, JPPRED is compared with a previous study. Encouragingly, JPPRED obtains balanced performance for each type of J-protein, which is significantly superior to that of the existing method. It is anticipated that JPPRED can be a potential candidate for J-protein prediction.
format Online
Article
Text
id pubmed-4637456
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-46374562015-11-19 JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method Zhang, Lina Zhang, Chengjin Gao, Rui Yang, Runtao Biomed Res Int Research Article Different types of J-proteins perform distinct functions in chaperone processes and diseases development. Accurate identification of types of J-proteins will provide significant clues to reveal the mechanism of J-proteins and contribute to developing drugs for diseases. In this study, an ensemble predictor called JPPRED for J-protein prediction is proposed with hybrid features, including split amino acid composition (SAAC), pseudo amino acid composition (PseAAC), and position specific scoring matrix (PSSM). To deal with the imbalanced benchmark dataset, the synthetic minority oversampling technique (SMOTE) and undersampling technique are applied. The average sensitivity of JPPRED based on above-mentioned individual feature spaces lies in the range of 0.744–0.851, indicating the discriminative power of these features. In addition, JPPRED yields the highest average sensitivity of 0.875 using the hybrid feature spaces of SAAC, PseAAC, and PSSM. Compared to individual base classifiers, JPPRED obtains more balanced and better performance for each type of J-proteins. To evaluate the prediction performance objectively, JPPRED is compared with previous study. Encouragingly, JPPRED obtains balanced performance for each type of J-proteins, which is significantly superior to that of the existing method. It is anticipated that JPPRED can be a potential candidate for J-protein prediction. Hindawi Publishing Corporation 2015 2015-10-26 /pmc/articles/PMC4637456/ /pubmed/26587542 http://dx.doi.org/10.1155/2015/705156 Text en Copyright © 2015 Lina Zhang et al. 
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Zhang, Lina
Zhang, Chengjin
Gao, Rui
Yang, Runtao
JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title_full JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title_fullStr JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title_full_unstemmed JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title_short JPPRED: Prediction of Types of J-Proteins from Imbalanced Data Using an Ensemble Learning Method
title_sort jppred: prediction of types of j-proteins from imbalanced data using an ensemble learning method
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4637456/
https://www.ncbi.nlm.nih.gov/pubmed/26587542
http://dx.doi.org/10.1155/2015/705156
work_keys_str_mv AT zhanglina jppredpredictionoftypesofjproteinsfromimbalanceddatausinganensemblelearningmethod
AT zhangchengjin jppredpredictionoftypesofjproteinsfromimbalanceddatausinganensemblelearningmethod
AT gaorui jppredpredictionoftypesofjproteinsfromimbalanceddatausinganensemblelearningmethod
AT yangruntao jppredpredictionoftypesofjproteinsfromimbalanceddatausinganensemblelearningmethod
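
Illustrative note on the abstract: the description above names split amino acid composition (SAAC) as one of the feature spaces. As a rough sketch only, and not the authors' implementation, SAAC can be computed by splitting a sequence into an N-terminal, a middle, and a C-terminal segment and taking the 20-residue composition of each, giving 60 features. The function names, the terminal length of 25 residues, and the handling of short sequences are assumptions made for illustration; the record does not state the values used in the paper.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the fraction of each standard amino acid in a segment."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # Empty segment (or no standard residues): all-zero composition.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def saac(sequence, terminal_len=25):
    """Split amino acid composition: N-terminal + middle + C-terminal.

    terminal_len is a hypothetical default; for short sequences the
    terminal segments shrink to one third of the sequence length.
    """
    seq = sequence.upper()
    t = min(terminal_len, len(seq) // 3)
    n_part = seq[:t]
    middle = seq[t:len(seq) - t]
    c_part = seq[len(seq) - t:]
    return composition(n_part) + composition(middle) + composition(c_part)


if __name__ == "__main__":
    features = saac("A" * 25 + "C" * 50 + "D" * 25)
    print(len(features))
```

In a hybrid setting such as the one the abstract describes, a vector like this would be concatenated with the PseAAC and PSSM features before classification; those two feature types are not sketched here.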