
BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens

Bibliographic Details
Main authors: Liu, Yinbo; Liu, Yufeng; Wang, Gang-Ao; Cheng, Yinchu; Bi, Shoudong; Zhu, Xiaolei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580886/
https://www.ncbi.nlm.nih.gov/pubmed/36304324
http://dx.doi.org/10.3389/fbinf.2022.834153
_version_ 1784812493289291776
author Liu, Yinbo
Liu, Yufeng
Wang, Gang-Ao
Cheng, Yinchu
Bi, Shoudong
Zhu, Xiaolei
author_sort Liu, Yinbo
collection PubMed
description As one of the most important posttranslational modifications (PTMs), protein lysine glycation changes the characteristics of the proteins and leads to the dysfunction of the proteins, which may cause diseases. Accurately detecting the glycation sites is of great benefit for understanding the biological function and potential mechanism of glycation in the treatment of diseases. However, experimental methods are expensive and time-consuming for lysine glycation site identification. Instead, computational methods, with their higher efficiency and lower cost, could be an important supplement to the experimental methods. In this study, we proposed a novel predictor, BERT-Kgly, for protein lysine glycation site prediction, which was developed by extracting embedding features of protein segments from pretrained Bidirectional Encoder Representations from Transformers (BERT) models. Three pretrained BERT models were explored to get the embeddings with optimal representability, and three downstream deep networks were employed to build our models. Our results showed that the model based on embeddings extracted from the BERT model pretrained on 556,603 protein sequences of UniProt outperforms other models. In addition, an independent test set was used to evaluate and compare our model with other existing methods, which indicated that our model was superior to other existing models.
format Online
Article
Text
id pubmed-9580886
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9580886 2022-10-26 BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens Liu, Yinbo Liu, Yufeng Wang, Gang-Ao Cheng, Yinchu Bi, Shoudong Zhu, Xiaolei Front Bioinform Bioinformatics Frontiers Media S.A. 2022-02-18 /pmc/articles/PMC9580886/ /pubmed/36304324 http://dx.doi.org/10.3389/fbinf.2022.834153 Text en Copyright © 2022 Liu, Liu, Wang, Cheng, Bi and Zhu.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580886/
https://www.ncbi.nlm.nih.gov/pubmed/36304324
http://dx.doi.org/10.3389/fbinf.2022.834153