
PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations

Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, the traditional...


Bibliographic Details
Main authors: Auliah, Firda Nurul; Nilamyani, Andi Nur; Shoombuatong, Watshara; Alam, Md Ashad; Hasan, Md Mehedi; Kurata, Hiroyuki
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924619/
https://www.ncbi.nlm.nih.gov/pubmed/33672741
http://dx.doi.org/10.3390/ijms22042120
author Auliah, Firda Nurul
Nilamyani, Andi Nur
Shoombuatong, Watshara
Alam, Md Ashad
Hasan, Md Mehedi
Kurata, Hiroyuki
collection PubMed
description Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, traditional experimental methods are laborious and time-consuming. Hence, computational algorithms that can predict potential pupylation sites from sequence features are much needed. In this research, a new prediction model, PUP-Fuse, has been developed for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The PUP-Fuse web server, with curated datasets, is freely available.
format Online
Article
Text
id pubmed-7924619
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7924619 2021-03-03 PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations Auliah, Firda Nurul Nilamyani, Andi Nur Shoombuatong, Watshara Alam, Md Ashad Hasan, Md Mehedi Kurata, Hiroyuki Int J Mol Sci Article Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, traditional experimental methods are laborious and time-consuming. Hence, computational algorithms that can predict potential pupylation sites from sequence features are much needed. In this research, a new prediction model, PUP-Fuse, has been developed for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The PUP-Fuse web server, with curated datasets, is freely available. MDPI 2021-02-20 /pmc/articles/PMC7924619/ /pubmed/33672741 http://dx.doi.org/10.3390/ijms22042120 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations
topic Article