Machine learning-based approaches for ubiquitination site prediction in human proteins

Protein ubiquitination is a critical post-translational modification (PTM) involved in numerous cellular processes. Identifying ubiquitination sites (Ubi-sites) on proteins offers valuable insights into their function and regulatory mechanisms. Due to the costly and time-consuming nature of traditio...

Bibliographic Details
Main authors: Pourmirzaei, Mahdi, Ramazi, Shahin, Esmaili, Farzaneh, Shojaeilangari, Seyedehsamaneh, Allahvardi, Abdollah
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10683244/
https://www.ncbi.nlm.nih.gov/pubmed/38017391
http://dx.doi.org/10.1186/s12859-023-05581-w
_version_ 1785151151475261440
author Pourmirzaei, Mahdi
Ramazi, Shahin
Esmaili, Farzaneh
Shojaeilangari, Seyedehsamaneh
Allahvardi, Abdollah
author_facet Pourmirzaei, Mahdi
Ramazi, Shahin
Esmaili, Farzaneh
Shojaeilangari, Seyedehsamaneh
Allahvardi, Abdollah
author_sort Pourmirzaei, Mahdi
collection PubMed
description Protein ubiquitination is a critical post-translational modification (PTM) involved in numerous cellular processes. Identifying ubiquitination sites (Ubi-sites) on proteins offers valuable insights into their function and regulatory mechanisms. Due to the costly and time-consuming nature of traditional approaches for Ubi-site detection, there has been growing interest in leveraging artificial intelligence for computer-aided Ubi-site prediction. In this study, we collected experimentally verified Ubi-sites of human proteins from the dbPTM database, then applied a comprehensive set of state-of-the-art computational methods, along with standard evaluation metrics and a proper validation strategy, to Ubi-site prediction. We demonstrated the effectiveness of our framework by comparing ten machine learning (ML) based approaches in three categories: feature-based conventional ML methods, end-to-end sequence-based deep learning (DL) techniques, and hybrid feature-based DL models. Our results revealed that DL approaches outperformed the classical ML methods; the best performance, a 0.902 F1-score, 0.8198 accuracy, 0.8786 precision, and 0.9147 recall, was achieved by a DL model using both raw amino acid sequences and hand-crafted features. Interestingly, our experimental results showed that the performance of DL methods correlated positively with the length of amino acid fragments, suggesting that utilizing the entire sequence can lead to more accurate predictions in future research. Additionally, we developed a meticulously curated benchmark for Ubi-site prediction in human proteins. This benchmark serves as a valuable resource for future studies, enabling fair and accurate comparisons between different methods. Overall, our work highlights the potential of ML, particularly DL techniques, in predicting Ubi-sites and furthering our knowledge of protein regulation through ubiquitination in cells.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05581-w.
format Online
Article
Text
id pubmed-10683244
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10683244 2023-11-30 Machine learning-based approaches for ubiquitination site prediction in human proteins Pourmirzaei, Mahdi Ramazi, Shahin Esmaili, Farzaneh Shojaeilangari, Seyedehsamaneh Allahvardi, Abdollah BMC Bioinformatics Research Protein ubiquitination is a critical post-translational modification (PTM) involved in numerous cellular processes. Identifying ubiquitination sites (Ubi-sites) on proteins offers valuable insights into their function and regulatory mechanisms. Due to the costly and time-consuming nature of traditional approaches for Ubi-site detection, there has been growing interest in leveraging artificial intelligence for computer-aided Ubi-site prediction. In this study, we collected experimentally verified Ubi-sites of human proteins from the dbPTM database, then applied a comprehensive set of state-of-the-art computational methods, along with standard evaluation metrics and a proper validation strategy, to Ubi-site prediction. We demonstrated the effectiveness of our framework by comparing ten machine learning (ML) based approaches in three categories: feature-based conventional ML methods, end-to-end sequence-based deep learning (DL) techniques, and hybrid feature-based DL models. Our results revealed that DL approaches outperformed the classical ML methods; the best performance, a 0.902 F1-score, 0.8198 accuracy, 0.8786 precision, and 0.9147 recall, was achieved by a DL model using both raw amino acid sequences and hand-crafted features. Interestingly, our experimental results showed that the performance of DL methods correlated positively with the length of amino acid fragments, suggesting that utilizing the entire sequence can lead to more accurate predictions in future research. Additionally, we developed a meticulously curated benchmark for Ubi-site prediction in human proteins.
This benchmark serves as a valuable resource for future studies, enabling fair and accurate comparisons between different methods. Overall, our work highlights the potential of ML, particularly DL techniques, in predicting Ubi-sites and furthering our knowledge of protein regulation through ubiquitination in cells. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05581-w. BioMed Central 2023-11-28 /pmc/articles/PMC10683244/ /pubmed/38017391 http://dx.doi.org/10.1186/s12859-023-05581-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Pourmirzaei, Mahdi
Ramazi, Shahin
Esmaili, Farzaneh
Shojaeilangari, Seyedehsamaneh
Allahvardi, Abdollah
Machine learning-based approaches for ubiquitination site prediction in human proteins
title Machine learning-based approaches for ubiquitination site prediction in human proteins
title_full Machine learning-based approaches for ubiquitination site prediction in human proteins
title_fullStr Machine learning-based approaches for ubiquitination site prediction in human proteins
title_full_unstemmed Machine learning-based approaches for ubiquitination site prediction in human proteins
title_short Machine learning-based approaches for ubiquitination site prediction in human proteins
title_sort machine learning-based approaches for ubiquitination site prediction in human proteins
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10683244/
https://www.ncbi.nlm.nih.gov/pubmed/38017391
http://dx.doi.org/10.1186/s12859-023-05581-w
work_keys_str_mv AT pourmirzaeimahdi machinelearningbasedapproachesforubiquitinationsitepredictioninhumanproteins
AT ramazishahin machinelearningbasedapproachesforubiquitinationsitepredictioninhumanproteins
AT esmailifarzaneh machinelearningbasedapproachesforubiquitinationsitepredictioninhumanproteins
AT shojaeilangariseyedehsamaneh machinelearningbasedapproachesforubiquitinationsitepredictioninhumanproteins
AT allahvardiabdollah machinelearningbasedapproachesforubiquitinationsitepredictioninhumanproteins