Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture

BACKGROUND: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation, but also i...

Bibliographic Details
Main Authors: He, Fei, Wang, Rui, Li, Jiagen, Bao, Lingling, Xu, Dong, Zhao, Xiaowei
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249717/
https://www.ncbi.nlm.nih.gov/pubmed/30463553
http://dx.doi.org/10.1186/s12918-018-0628-0
_version_ 1783372799468896256
author He, Fei
Wang, Rui
Li, Jiagen
Bao, Lingling
Xu, Dong
Zhao, Xiaowei
author_facet He, Fei
Wang, Rui
Li, Jiagen
Bao, Lingling
Xu, Dong
Zhao, Xiaowei
author_sort He, Fei
collection PubMed
description BACKGROUND: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic dissection of the ubiquitination proteome is an appealing and challenging research topic. Existing methods for identifying protein ubiquitination sites fall into two kinds: mass spectrometry and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but are time-consuming and expensive. Therefore, it is a priority to develop computational approaches that can effectively and accurately identify protein ubiquitination sites. RESULTS: Existing computational methods usually require feature engineering, which may lead to redundant and biased representations. Deep learning, by contrast, can extract underlying characteristics from large-scale training data via multi-layer networks and non-linear mapping operations. In this paper, we propose a deep architecture with multiple modalities to identify ubiquitination sites. First, guided by prior biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties and sequence profiles, and designed different deep network layers to extract hidden representations from them. Then, the deep representations generated for the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available protein ubiquitination site database, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools. CONCLUSION: The results of comparative experiments validated the effectiveness of our deep network and also showed that our method outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation.
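The four figures reported in the description (specificity, sensitivity, accuracy, MCC) are all derived from the same confusion matrix. The sketch below shows the standard definitions only; it is not taken from the authors' code. The function name binary_metrics and the counts passed to it are hypothetical, chosen for illustration and not drawn from the article or from PLMD.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true sites that are recovered
    specificity = tn / (tn + fp)   # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, deliberately imbalanced counts (90 positives, 900 negatives).
m = binary_metrics(tp=60, fp=300, tn=600, fn=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity and specificity are both two thirds, yet the MCC is much lower because the negative class dominates. This is one reason MCC is commonly reported alongside accuracy for site-prediction tasks, where candidate lysines far outnumber confirmed sites.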
format Online
Article
Text
id pubmed-6249717
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6249717 2018-11-26 Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture He, Fei Wang, Rui Li, Jiagen Bao, Lingling Xu, Dong Zhao, Xiaowei BMC Syst Biol Research BioMed Central 2018-11-22 /pmc/articles/PMC6249717/ /pubmed/30463553 http://dx.doi.org/10.1186/s12918-018-0628-0 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
He, Fei
Wang, Rui
Li, Jiagen
Bao, Lingling
Xu, Dong
Zhao, Xiaowei
Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title_full Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title_fullStr Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title_full_unstemmed Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title_short Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
title_sort large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249717/
https://www.ncbi.nlm.nih.gov/pubmed/30463553
http://dx.doi.org/10.1186/s12918-018-0628-0
work_keys_str_mv AT hefei largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture
AT wangrui largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture
AT lijiagen largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture
AT baolingling largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture
AT xudong largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture
AT zhaoxiaowei largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture