
FGsub: Fusarium graminearum protein subcellular localizations predicted from primary structures

BACKGROUND: The fungal pathogen Fusarium graminearum (teleomorph Gibberella zeae) is the causal agent of several destructive crop diseases, in which a set of genes usually works in concert to cause disease. To function appropriately, the F. graminearum proteins inside one cell should be assigne...


Bibliographic Details
Main Authors: Sun, Chenglei; Zhao, Xing-Ming; Tang, Weihua; Chen, Luonan
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2982686/
https://www.ncbi.nlm.nih.gov/pubmed/20840726
http://dx.doi.org/10.1186/1752-0509-4-S2-S12
author Sun, Chenglei
Zhao, Xing-Ming
Tang, Weihua
Chen, Luonan
collection PubMed
description BACKGROUND: The fungal pathogen Fusarium graminearum (teleomorph Gibberella zeae) is the causal agent of several destructive crop diseases, in which a set of genes usually works in concert to cause disease. To function appropriately, the F. graminearum proteins inside a cell must be assigned to different compartments, i.e. subcellular localizations. The subcellular localizations of F. graminearum proteins can therefore provide insights into protein functions and the pathogenic mechanisms of this destructive fungal pathogen. Unfortunately, no subcellular localization information is currently available for F. graminearum proteins. Because laboratory experiments are expensive and time-consuming, computational approaches provide an alternative way to predict F. graminearum protein subcellular localizations. RESULTS: In this paper, we developed a novel predictor, FGsub, to predict F. graminearum protein subcellular localizations from primary structures. First, a non-redundant fungal data set with subcellular localization annotations is collected from the UniProtKB database and used as the training set, with the subcellular locations classified into 10 groups. Subsequently, a Support Vector Machine (SVM) is trained on the training set and used to predict subcellular localizations for those F. graminearum proteins that have no significant sequence similarity to proteins in the training set. The performance of the SVMs on the training set under 10-fold cross-validation demonstrates the efficiency and effectiveness of the proposed method. In addition, for F. graminearum proteins that do have significant sequence similarity to proteins in the training set, BLAST is used to transfer annotations from homologous proteins to uncharacterized F. graminearum proteins, so that the F. graminearum proteins are annotated more comprehensively. CONCLUSIONS: In this work, we present FGsub to predict F. graminearum protein subcellular localizations in a comprehensive manner. We make four contributions to this field. First, we present a new algorithm to cope with the imbalance problem that arises in protein subcellular localization prediction, which resolves the imbalance and avoids false positive results. Second, we design an ensemble classifier that employs feature selection to further improve prediction accuracy. Third, we use BLAST to complement the machine-learning-based methods, which enlarges our prediction coverage. Last and most important, we predict the subcellular localizations of 12786 F. graminearum proteins, which provides insights into protein functions and the pathogenic mechanisms of this destructive fungal pathogen.
format Text
id pubmed-2982686
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2982686 2010-11-17 FGsub: Fusarium graminearum protein subcellular localizations predicted from primary structures Sun, Chenglei Zhao, Xing-Ming Tang, Weihua Chen, Luonan BMC Syst Biol Proceedings BioMed Central 2010-09-13 /pmc/articles/PMC2982686/ /pubmed/20840726 http://dx.doi.org/10.1186/1752-0509-4-S2-S12 Text en Copyright ©2010 Zhao et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Sun, Chenglei
Zhao, Xing-Ming
Tang, Weihua
Chen, Luonan
FGsub: Fusarium graminearum protein subcellular localizations predicted from primary structures
title FGsub: Fusarium graminearum protein subcellular localizations predicted from primary structures
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2982686/
https://www.ncbi.nlm.nih.gov/pubmed/20840726
http://dx.doi.org/10.1186/1752-0509-4-S2-S12