Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture
BACKGROUND: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin molecule is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also i...
Main Authors: | He, Fei; Wang, Rui; Li, Jiagen; Bao, Lingling; Xu, Dong; Zhao, Xiaowei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249717/ https://www.ncbi.nlm.nih.gov/pubmed/30463553 http://dx.doi.org/10.1186/s12918-018-0628-0 |
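The abstract of this record describes encoding protein sequence fragments around candidate lysine (K) sites, with the raw fragment as one of three input modalities. The following is a minimal sketch of that first step only, not the authors' implementation: the flank size of 3, the padding symbol `X`, and the function names `lysine_windows` and `one_hot` are all illustrative assumptions that do not come from the record.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends


def lysine_windows(sequence: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (0-based position, fragment) for every lysine in `sequence`.

    Each fragment has length 2 * flank + 1 and is centred on the lysine.
    The flank size here is a placeholder, not the value used in the paper.
    """
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for pos, residue in enumerate(sequence):
        if residue == "K":
            # Residue `pos` sits at index pos + flank in `padded`,
            # so this slice is centred on it.
            windows.append((pos, padded[pos : pos + 2 * flank + 1]))
    return windows


def one_hot(fragment: str) -> List[List[int]]:
    """Encode each residue as a 20-dimensional indicator vector.

    Padding and non-standard residues map to an all-zero vector.
    """
    return [[1 if residue == aa else 0 for aa in AMINO_ACIDS] for residue in fragment]


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    for pos, frag in lysine_windows("MKTAYIAKQR", flank=3):
        encoded = one_hot(frag)
        print(pos, frag, f"{len(encoded)}x{len(encoded[0])}")
```

Padding the sequence ends keeps every fragment the same length, which a fixed-size network input requires; the physico-chemical and sequence-profile modalities named in the abstract would need their own encoders and are not sketched here.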
_version_ | 1783372799468896256 |
---|---|
author | He, Fei Wang, Rui Li, Jiagen Bao, Lingling Xu, Dong Zhao, Xiaowei |
author_facet | He, Fei Wang, Rui Li, Jiagen Bao, Lingling Xu, Dong Zhao, Xiaowei |
author_sort | He, Fei |
collection | PubMed |
description | BACKGROUND: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin molecule is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic analysis of the ubiquitination proteome is an appealing and challenging research topic. Existing methods for identifying protein ubiquitination sites fall into two kinds: mass spectrometry-based experimental methods and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but they are time-consuming and expensive. Therefore, developing computational approaches that can effectively and accurately identify protein ubiquitination sites is a priority. RESULTS: Existing computational methods usually require feature engineering, which may lead to redundant and biased representations. Deep learning, by contrast, can extract underlying characteristics from large-scale training data via multi-layer networks and non-linear mapping operations. In this paper, we propose a deep architecture with multiple modalities to identify ubiquitination sites. First, based on prior and biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties, and sequence profiles, and designed different deep network layers to extract hidden representations from them. Then, the generated deep representations corresponding to the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available protein ubiquitination site database, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools. CONCLUSION: The comparative experiments validated the effectiveness of our deep network and showed that our method outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation. |
format | Online Article Text |
id | pubmed-6249717 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6249717 2018-11-26 Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture He, Fei Wang, Rui Li, Jiagen Bao, Lingling Xu, Dong Zhao, Xiaowei BMC Syst Biol Research BACKGROUND: Ubiquitination, also called “lysine ubiquitination”, occurs when a ubiquitin molecule is attached to lysine (K) residues in target proteins. As one of the most important post-translational modifications (PTMs), it plays a significant role not only in protein degradation but also in other cellular functions. Thus, systematic analysis of the ubiquitination proteome is an appealing and challenging research topic. Existing methods for identifying protein ubiquitination sites fall into two kinds: mass spectrometry-based experimental methods and computational methods. Mass spectrometry-based experimental methods can discover ubiquitination sites in eukaryotes, but they are time-consuming and expensive. Therefore, developing computational approaches that can effectively and accurately identify protein ubiquitination sites is a priority. RESULTS: Existing computational methods usually require feature engineering, which may lead to redundant and biased representations. Deep learning, by contrast, can extract underlying characteristics from large-scale training data via multi-layer networks and non-linear mapping operations. In this paper, we propose a deep architecture with multiple modalities to identify ubiquitination sites. First, based on prior and biological knowledge, we encoded protein sequence fragments around candidate ubiquitination sites into three modalities, namely raw protein sequence fragments, physico-chemical properties, and sequence profiles, and designed different deep network layers to extract hidden representations from them. Then, the generated deep representations corresponding to the three modalities were merged to build the final model. We evaluated our algorithm on PLMD, the largest available protein ubiquitination site database, and achieved 66.4% specificity, 66.7% sensitivity, 66.43% accuracy, and an MCC of 0.221. A number of comparative experiments also indicated that our multimodal deep architecture outperformed several popular protein ubiquitination site prediction tools. CONCLUSION: The comparative experiments validated the effectiveness of our deep network and showed that our method outperformed several popular protein ubiquitination site prediction tools. The source code of our proposed method is available at https://github.com/jiagenlee/deepUbiquitylation. BioMed Central 2018-11-22 /pmc/articles/PMC6249717/ /pubmed/30463553 http://dx.doi.org/10.1186/s12918-018-0628-0 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research He, Fei Wang, Rui Li, Jiagen Bao, Lingling Xu, Dong Zhao, Xiaowei Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title | Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title_full | Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title_fullStr | Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title_full_unstemmed | Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title_short | Large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
title_sort | large-scale prediction of protein ubiquitination sites using a multimodal deep architecture |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6249717/ https://www.ncbi.nlm.nih.gov/pubmed/30463553 http://dx.doi.org/10.1186/s12918-018-0628-0 |
work_keys_str_mv | AT hefei largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture AT wangrui largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture AT lijiagen largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture AT baolingling largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture AT xudong largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture AT zhaoxiaowei largescalepredictionofproteinubiquitinationsitesusingamultimodaldeeparchitecture |
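The description field reports specificity, sensitivity, accuracy, and an MCC (Matthews correlation coefficient) value. As a reading aid, the sketch below shows the standard definitions of those four measures computed from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage example are invented for illustration; they are not the paper's data or evaluation code.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification measures from confusion-matrix counts.

    Assumes at least one positive and one negative example, so the
    sensitivity and specificity denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites correctly rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show an imbalanced case.
    for name, value in binary_metrics(tp=60, fp=300, tn=600, fn=40).items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, which the majority class dominates when positives are rare, MCC uses all four cells of the confusion matrix, so a modest MCC can coexist with sensitivity and specificity that look balanced.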