Machine learning-based approaches for ubiquitination site prediction in human proteins
Protein ubiquitination is a critical post-translational modification (PTM) involved in numerous cellular processes. Identifying ubiquitination sites (Ubi-sites) on proteins offers valuable insights into their function and regulatory mechanisms. Due to the cost- and time-consuming nature of traditio...
Main authors: | Pourmirzaei, Mahdi; Ramazi, Shahin; Esmaili, Farzaneh; Shojaeilangari, Seyedehsamaneh; Allahvardi, Abdollah |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10683244/ https://www.ncbi.nlm.nih.gov/pubmed/38017391 http://dx.doi.org/10.1186/s12859-023-05581-w |
_version_ | 1785151151475261440 |
---|---|
author | Pourmirzaei, Mahdi; Ramazi, Shahin; Esmaili, Farzaneh; Shojaeilangari, Seyedehsamaneh; Allahvardi, Abdollah |
author_facet | Pourmirzaei, Mahdi; Ramazi, Shahin; Esmaili, Farzaneh; Shojaeilangari, Seyedehsamaneh; Allahvardi, Abdollah |
author_sort | Pourmirzaei, Mahdi |
collection | PubMed |
description | Protein ubiquitination is a critical post-translational modification (PTM) involved in numerous cellular processes. Identifying ubiquitination sites (Ubi-sites) on proteins offers valuable insights into their function and regulatory mechanisms. Because traditional approaches to Ubi-site detection are costly and time-consuming, there has been growing interest in leveraging artificial intelligence for computer-aided Ubi-site prediction. In this study, we collected experimentally verified Ubi-sites of human proteins from the dbPTM database, then applied a comprehensive set of state-of-the-art computational methods, together with standard evaluation metrics and a proper validation strategy, to Ubi-site prediction. We demonstrated the effectiveness of our framework by comparing ten machine learning (ML) based approaches in three categories: feature-based conventional ML methods, end-to-end sequence-based deep learning (DL) techniques, and hybrid feature-based DL models. Our results revealed that DL approaches outperformed the classical ML methods; the best-performing DL model, which used both raw amino acid sequences and hand-crafted features, achieved a 0.902 F1-score, 0.8198 accuracy, 0.8786 precision, and 0.9147 recall. Interestingly, our experimental results showed that the performance of DL methods was positively correlated with the length of amino acid fragments, suggesting that utilizing the entire sequence can lead to more accurate predictions in future research. Additionally, we developed a meticulously curated benchmark for Ubi-site prediction in human proteins. This benchmark serves as a valuable resource for future studies, enabling fair and accurate comparisons between different methods. Overall, our work highlights the potential of ML, particularly DL techniques, in predicting Ubi-sites and furthering our knowledge of protein regulation through ubiquitination in cells. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05581-w. |
format | Online Article Text |
id | pubmed-10683244 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10683244 2023-11-30 Machine learning-based approaches for ubiquitination site prediction in human proteins Pourmirzaei, Mahdi; Ramazi, Shahin; Esmaili, Farzaneh; Shojaeilangari, Seyedehsamaneh; Allahvardi, Abdollah BMC Bioinformatics Research BioMed Central 2023-11-28 /pmc/articles/PMC10683244/ /pubmed/38017391 http://dx.doi.org/10.1186/s12859-023-05581-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Pourmirzaei, Mahdi Ramazi, Shahin Esmaili, Farzaneh Shojaeilangari, Seyedehsamaneh Allahvardi, Abdollah Machine learning-based approaches for ubiquitination site prediction in human proteins |
title | Machine learning-based approaches for ubiquitination site prediction in human proteins |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10683244/ https://www.ncbi.nlm.nih.gov/pubmed/38017391 http://dx.doi.org/10.1186/s12859-023-05581-w |
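
The description in this record reports that DL performance improved as the amino acid fragments grew longer. As a rough illustration of what such a fragment can look like, the sketch below cuts fixed-length windows centered on each lysine (K), the residue at which ubiquitin is canonically attached. Everything in it is an assumption made for illustration and is not taken from the article: the function name `lysine_windows`, the `flank` parameter, the `X` padding symbol at the protein ends, and the toy sequence.

```python
from typing import List, Tuple


def lysine_windows(sequence: str, flank: int, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in `sequence`.

    Each window has length 2 * flank + 1 and is centered on the lysine.
    Positions that fall outside the protein are filled with `pad`.
    Hypothetical helper: the record does not describe the authors' own
    fragment extraction.
    """
    if flank < 0:
        raise ValueError("flank must be non-negative")
    if len(pad) != 1:
        raise ValueError("pad must be a single character")

    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # Residue i of the original sequence sits at index i + flank in
            # `padded`, so the slice starting at i is centered on it.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # made-up toy sequence, not a dbPTM entry
    for position, window in lysine_windows(toy, flank=3):
        print(position, window)
```

Raising `flank` lengthens every fragment, which is the quantity the description says correlated positively with DL performance; how the authors actually built and labeled their fragments should be checked against the article itself.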