VTP-Identifier: Vesicular Transport Proteins Identification Based on PSSM Profiles and XGBoost

Vesicular transport proteins are related to many human diseases, and they threaten human health when they undergo pathological changes. Protein function prediction has been one of the most intensively studied topics in bioinformatics. In this work, we developed a useful tool to identify vesicular transport pro...

Bibliographic Details
Main Authors: Gong, Yue; Dong, Benzhi; Zhang, Zixiao; Zhai, Yixiao; Gao, Bo; Zhang, Tianjiao; Zhang, Jingyu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762342/
https://www.ncbi.nlm.nih.gov/pubmed/35047020
http://dx.doi.org/10.3389/fgene.2021.808856
author Gong, Yue
Dong, Benzhi
Zhang, Zixiao
Zhai, Yixiao
Gao, Bo
Zhang, Tianjiao
Zhang, Jingyu
collection PubMed
description Vesicular transport proteins are related to many human diseases, and they threaten human health when they undergo pathological changes. Protein function prediction has been one of the most intensively studied topics in bioinformatics. In this work, we developed a useful tool to identify vesicular transport proteins. Our strategy is to extract transition probability composition, autocovariance transformation and other information from the position-specific scoring matrix (PSSM) as feature vectors. EditedNearestNeighbours (ENN) is used to address the imbalance of the data set, and the Max-Relevance-Max-Distance (MRMD) algorithm is adopted to reduce the dimension of the feature vector. We used 5-fold cross-validation and independent test sets to evaluate our model. On the test set, VTP-Identifier performed better than GRU. The accuracy, Matthews correlation coefficient (MCC) and area under the ROC curve (AUC) were 83.6%, 0.531 and 0.873, respectively.
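The autocovariance transformation mentioned in the description can be illustrated with a minimal sketch. It assumes a commonly used definition, AC(j, g) = 1/(L-g) * sum over i of (P[i][j] - mean_j) * (P[i+g][j] - mean_j), for column j and lag g of an L-row PSSM; the record does not state which variant or maximum lag the authors used. The function name `pssm_autocovariance`, the `max_lag` parameter and the two-column toy matrix are hypothetical and chosen only for illustration (a real PSSM has 20 columns).

```python
from statistics import fmean


def pssm_autocovariance(pssm, max_lag):
    """Autocovariance features from a PSSM given as a list of rows.

    Assumed definition (not confirmed by the record):
        AC(j, g) = 1/(L-g) * sum_i (P[i][j] - mean_j) * (P[i+g][j] - mean_j)
    Returns a flat list of n_cols * max_lag values, ordered by lag, then column.
    """
    length = len(pssm)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(pssm[0])
    # Mean score of each PSSM column over all sequence positions.
    col_means = [fmean(row[j] for row in pssm) for j in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            # Average product of mean-centred scores that are `lag` positions apart.
            total = sum(
                (pssm[i][j] - col_means[j]) * (pssm[i + lag][j] - col_means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Hypothetical toy matrix: 4 positions, 2 columns.
toy_pssm = [[1.0, 0.0], [3.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
ac = pssm_autocovariance(toy_pssm, max_lag=2)
print(ac)  # [-1.0, 0.0, 1.0, 0.0]
```

Because the output length depends only on the number of columns and the maximum lag, proteins of different lengths map to feature vectors of the same size, which is what makes this kind of transformation usable as classifier input.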
format Online
Article
Text
id pubmed-8762342
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8762342 2022-01-18 Front Genet Genetics Frontiers Media S.A. 2022-01-03 /pmc/articles/PMC8762342/ /pubmed/35047020 http://dx.doi.org/10.3389/fgene.2021.808856 Text en
Copyright © 2022 Gong, Dong, Zhang, Zhai, Gao, Zhang and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title VTP-Identifier: Vesicular Transport Proteins Identification Based on PSSM Profiles and XGBoost
topic Genetics