Detailed prediction of protein sub-nuclear localization

Bibliographic Details
Main authors: Littmann, Maria; Goldberg, Tatyana; Seitz, Sebastian; Bodén, Mikael; Rost, Burkhard
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480651/
https://www.ncbi.nlm.nih.gov/pubmed/31014229
http://dx.doi.org/10.1186/s12859-019-2790-9
author Littmann, Maria
Goldberg, Tatyana
Seitz, Sebastian
Bodén, Mikael
Rost, Burkhard
collection PubMed
description BACKGROUND: Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important for understanding the interior nuclear mechanisms. Despite advances in high-throughput methods, experimental protein annotations remain limited. Predictions of cellular compartments have become very accurate, largely at the expense of leaving out substructures inside the nucleus, making a fine-grained analysis impossible. RESULTS: Here, we present a new method (LocNuclei) that predicts nuclear substructures from sequence alone. LocNuclei uses a string-based Profile Kernel with Support Vector Machines (SVMs). It distinguishes sub-nuclear localization in 13 distinct substructures and distinguishes between nuclear proteins confined to the nucleus and those that are also native to other compartments (traveler proteins). High performance was achieved by implicitly leveraging a large biological knowledge-base in creating predictions by homology-based inference through BLAST. Using this approach, the performance reached AUC = 0.70–0.74 and Q13 = 59–65%. Traveler proteins (nucleus and other) were identified at Q2 = 70–74%. A Gene Ontology (GO) analysis of the enrichment of biological processes revealed that the predicted sub-nuclear compartments matched the expected functionality. Analysis of protein-protein interactions (PPI) shows that the formation of compartments and the functionality of proteins in these compartments rely heavily on interactions between proteins. This suggests that the LocNuclei predictions carry important information about function. The source code and data sets are available through GitHub: https://github.com/Rostlab/LocNuclei. CONCLUSIONS: LocNuclei predicts subnuclear compartments and traveler proteins accurately. These predictions carry important information about functionality and PPIs.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2790-9) contains supplementary material, which is available to authorized users.
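note The description reports Q13 and Q2 without defining them. The sketch below illustrates one common reading, namely N-state accuracy: the percentage of proteins whose predicted class matches the annotated class. This reading, the function name `q_accuracy`, the single-label-per-protein simplification, and the toy labels are all assumptions made for illustration; none of them is taken from the LocNuclei source code or data.

```python
def q_accuracy(observed, predicted):
    """Return the percentage of proteins whose predicted class equals the observed class.

    Assumes exactly one class label per protein (a simplification for illustration).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one protein is required")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


# Hypothetical toy labels, not real annotations or real predictions.
observed = ["nucleolus", "nuclear speckle", "chromatin", "nucleolus"]
predicted = ["nucleolus", "chromatin", "chromatin", "nucleolus"]

print(f"Q = {q_accuracy(observed, predicted):.1f}%")
```

With 13 possible classes this quantity would correspond to Q13, and with two classes (nucleus only versus traveler) to Q2, under the assumption stated above.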
format Online
Article
Text
id pubmed-6480651
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6480651 2019-05-01 Detailed prediction of protein sub-nuclear localization Littmann, Maria Goldberg, Tatyana Seitz, Sebastian Bodén, Mikael Rost, Burkhard BMC Bioinformatics Methodology Article BioMed Central 2019-04-23 /pmc/articles/PMC6480651/ /pubmed/31014229 http://dx.doi.org/10.1186/s12859-019-2790-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Detailed prediction of protein sub-nuclear localization
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6480651/
https://www.ncbi.nlm.nih.gov/pubmed/31014229
http://dx.doi.org/10.1186/s12859-019-2790-9