A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction
BACKGROUND: Ubiquitylation is an important post-translational modification of proteins that not only plays a central role in cellular coding but is also closely associated with the development of a variety of diseases. The specific selection of substrates by E3 ligases is key to ubiquitylation. A...
Main Authors: | Luo, Mengqi; Li, Zhongyan; Li, Shangfu; Lee, Tzong-Yi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2021
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8524957/ https://www.ncbi.nlm.nih.gov/pubmed/34663215 http://dx.doi.org/10.1186/s12859-021-04435-7 |
_version_ | 1784585579787190272 |
---|---|
author | Luo, Mengqi; Li, Zhongyan; Li, Shangfu; Lee, Tzong-Yi |
author_facet | Luo, Mengqi; Li, Zhongyan; Li, Shangfu; Lee, Tzong-Yi |
author_sort | Luo, Mengqi |
collection | PubMed |
description | BACKGROUND: Ubiquitylation is an important post-translational modification of proteins that not only plays a central role in cellular coding but is also closely associated with the development of a variety of diseases. The specific selection of substrates by E3 ligases is key to ubiquitylation. As high-throughput analytical techniques continue to be applied to the study of ubiquitylation, large amounts of ubiquitylation site data and records of E3-substrate interactions continue to be generated. Biomedical literature is an important vehicle for information on E3-substrate interactions and related new discoveries, as well as an important channel through which researchers obtain such up-to-date data. The continuing growth of ubiquitylation-related literature makes it difficult for researchers to acquire and analyze this information. Automatic annotation of E3-substrate interaction sentences in the available literature is therefore urgently needed. RESULTS: In this research, we propose a model, based on representation and attention-mechanism-based deep learning methods, to automatically annotate E3-substrate interaction sentences in biomedical literature. Focusing on sentences that mention an E3 protein, we applied several natural language processing methods and a Long Short-Term Memory (LSTM)-based deep learning classifier to train the model. Experimental results demonstrate the effectiveness of the proposed model, and the proposed attention-mechanism deep learning method outperforms other statistical machine learning methods. To construct the model, we also created a manually annotated corpus of E3-substrate interaction sentences in which the E3 proteins and substrate proteins are labeled. The corpus and model should be a useful and valuable resource for the advancement of ubiquitylation-related research.
CONCLUSION: Having the entire manual corpus of E3-substrate interaction sentences readily available in electronic form will greatly facilitate subsequent text mining and machine learning analyses. Automatic annotation of ubiquitylation sentences stating E3 ligase-substrate interactions benefits significantly from semantic representation and deep learning. The model enables rapid access to information and can assist in further screening of key ubiquitylation ligase substrates for in-depth studies. |
format | Online Article Text |
id | pubmed-8524957 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8524957 2021-10-22 A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction Luo, Mengqi Li, Zhongyan Li, Shangfu Lee, Tzong-Yi BMC Bioinformatics Research BioMed Central 2021-10-18 /pmc/articles/PMC8524957/ /pubmed/34663215 http://dx.doi.org/10.1186/s12859-021-04435-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Luo, Mengqi Li, Zhongyan Li, Shangfu Lee, Tzong-Yi A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title | A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title_full | A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title_fullStr | A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title_full_unstemmed | A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title_short | A representation and deep learning model for annotating ubiquitylation sentences stating E3 ligase - substrate interaction |
title_sort | representation and deep learning model for annotating ubiquitylation sentences stating e3 ligase - substrate interaction |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8524957/ https://www.ncbi.nlm.nih.gov/pubmed/34663215 http://dx.doi.org/10.1186/s12859-021-04435-7 |
work_keys_str_mv | AT luomengqi arepresentationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT lizhongyan arepresentationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT lishangfu arepresentationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT leetzongyi arepresentationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT luomengqi representationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT lizhongyan representationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT lishangfu representationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction AT leetzongyi representationanddeeplearningmodelforannotatingubiquitylationsentencesstatinge3ligasesubstrateinteraction |