PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations
Pupylation is a type of reversible post-translational modification of proteins, which plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, the traditional...
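Main Authors: | Auliah, Firda Nurul; Nilamyani, Andi Nur; Shoombuatong, Watshara; Alam, Md Ashad; Hasan, Md Mehedi; Kurata, Hiroyuki |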
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924619/ https://www.ncbi.nlm.nih.gov/pubmed/33672741 http://dx.doi.org/10.3390/ijms22042120 |
_version_ | 1783659127039328256 |
---|---|
author | Auliah, Firda Nurul; Nilamyani, Andi Nur; Shoombuatong, Watshara; Alam, Md Ashad; Hasan, Md Mehedi; Kurata, Hiroyuki |
author_facet | Auliah, Firda Nurul; Nilamyani, Andi Nur; Shoombuatong, Watshara; Alam, Md Ashad; Hasan, Md Mehedi; Kurata, Hiroyuki |
author_sort | Auliah, Firda Nurul |
collection | PubMed |
description | Pupylation is a reversible post-translational modification of proteins that plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, traditional experimental methods are laborious and time-consuming, so computational algorithms that can predict potential pupylation sites from sequence features are much needed. In this research, a new prediction model, PUP-Fuse, was developed for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The PUP-Fuse web server, with curated datasets, is freely available. |
format | Online Article Text |
id | pubmed-7924619 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7924619 2021-03-03 PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations. Auliah, Firda Nurul; Nilamyani, Andi Nur; Shoombuatong, Watshara; Alam, Md Ashad; Hasan, Md Mehedi; Kurata, Hiroyuki. Int J Mol Sci. Article. Pupylation is a reversible post-translational modification of proteins that plays a key role in the cellular function of microbial organisms. Several proteomics methods have been developed for the prediction and analysis of pupylated proteins and pupylation sites. However, traditional experimental methods are laborious and time-consuming, so computational algorithms that can predict potential pupylation sites from sequence features are much needed. In this research, a new prediction model, PUP-Fuse, was developed for pupylation site prediction by integrating multiple sequence representations. We explored five types of feature encoding approaches and three machine learning (ML) algorithms. In the final model, we integrated the successive ML scores using a linear regression model. PUP-Fuse achieved a Matthews correlation coefficient of 0.768 in a 10-fold cross-validation test. It also outperformed existing predictors in an independent test. The PUP-Fuse web server, with curated datasets, is freely available. MDPI 2021-02-20 /pmc/articles/PMC7924619/ /pubmed/33672741 http://dx.doi.org/10.3390/ijms22042120 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | PUP-Fuse: Prediction of Protein Pupylation Sites by Integrating Multiple Sequence Representations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924619/ https://www.ncbi.nlm.nih.gov/pubmed/33672741 http://dx.doi.org/10.3390/ijms22042120 |
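Illustrative note on the description field: the abstract states that the final model integrates successive ML scores with a linear regression model and is evaluated by a Matthews correlation coefficient. The sketch below shows, on toy data only, how a linear combination of per-model scores could be thresholded and scored with that metric. The function names, weights, threshold, and scores are hypothetical placeholders chosen for illustration; they are not taken from the PUP-Fuse paper, and the weights here are fixed by hand rather than fitted.

```python
from math import sqrt


def fuse_scores(scores, weights, intercept=0.0):
    """Linearly combine the per-model scores of one candidate site.

    The weights and intercept are hypothetical; in a real pipeline they
    would be fitted by linear regression on training data.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return intercept + sum(w * s for w, s in zip(weights, scores))


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: four candidate lysine sites, each scored by three models.
site_scores = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.3],
    [0.6, 0.7, 0.4],
    [0.4, 0.2, 0.1],
]
labels = [1, 0, 1, 0]        # 1 = pupylated, 0 = not pupylated (made up)
weights = [0.5, 0.3, 0.2]    # hypothetical, not fitted

fused = [fuse_scores(s, weights) for s in site_scores]
predicted = [1 if f >= 0.5 else 0 for f in fused]  # assumed 0.5 cut-off
mcc = matthews_corrcoef(labels, predicted)
print(fused, predicted, mcc)
```

The metric ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why it is often preferred over accuracy on imbalanced site datasets.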