Adaptive learning embedding features to improve the predictive performance of SARS-CoV-2 phosphorylation sites

MOTIVATION: The rapid and extensive transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to an unprecedented global health emergency, affecting millions of people and causing an immense socioeconomic impact. The identification of SARS-CoV-2 phosphorylation sites p...

Bibliographic Details
Main authors: Jiao, Shihu, Ye, Xiucai, Ao, Chunyan, Sakurai, Tetsuya, Zou, Quan, Xu, Lei
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628388/
https://www.ncbi.nlm.nih.gov/pubmed/37847658
http://dx.doi.org/10.1093/bioinformatics/btad627
author Jiao, Shihu
Ye, Xiucai
Ao, Chunyan
Sakurai, Tetsuya
Zou, Quan
Xu, Lei
collection PubMed
description MOTIVATION: The rapid and extensive transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to an unprecedented global health emergency, affecting millions of people and causing an immense socioeconomic impact. The identification of SARS-CoV-2 phosphorylation sites plays an important role in unraveling the complex molecular mechanisms behind infection and the resulting alterations in host cell pathways. However, currently available prediction tools for identifying these sites lack accuracy and efficiency. RESULTS: In this study, we presented a comprehensive biological function analysis of SARS-CoV-2 infection in a clonal human lung epithelial A549 cell, revealing dramatic changes in protein phosphorylation pathways in host cells. Moreover, a novel deep learning predictor called PSPred-ALE is specifically designed to identify phosphorylation sites in human host cells that are infected with SARS-CoV-2. The key idea of PSPred-ALE lies in the use of a self-adaptive learning embedding algorithm, which enables the automatic extraction of context sequential features from protein sequences. In addition, the tool uses a multihead attention module that captures global information, further improving the accuracy of predictions. Comparative analysis of features demonstrated that the self-adaptive learning embedding features are superior to hand-crafted statistical features in capturing discriminative sequence information. Benchmarking comparison shows that PSPred-ALE outperforms the state-of-the-art prediction tools and achieves robust performance. Therefore, the proposed model can effectively identify phosphorylation sites, assisting biomedical scientists in understanding the mechanism of phosphorylation in SARS-CoV-2 infection. AVAILABILITY AND IMPLEMENTATION: PSPred-ALE is available at https://github.com/jiaoshihu/PSPred-ALE and Zenodo (https://doi.org/10.5281/zenodo.8330277).
format Online
Article
Text
id pubmed-10628388
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10628388 2023-11-08 Bioinformatics Original Paper Oxford University Press 2023-10-17 /pmc/articles/PMC10628388/ /pubmed/37847658 http://dx.doi.org/10.1093/bioinformatics/btad627 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Adaptive learning embedding features to improve the predictive performance of SARS-CoV-2 phosphorylation sites
topic Original Paper