Cargando…

Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble

BACKGROUND: It has become a very important and full of challenge task to predict bacterial protein subcellular locations using computational methods. Although there exist a lot of prediction methods for bacterial proteins, the majority of these methods can only deal with single-location proteins. Bu...

Descripción completa

Detalles Bibliográficos
Autores principales: Wang, Xiao, Zhang, Jun, Li, Guo-Zheng
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4705491/
https://www.ncbi.nlm.nih.gov/pubmed/26329681
http://dx.doi.org/10.1186/1471-2105-16-S12-S1
_version_ 1782409023238176768
author Wang, Xiao
Zhang, Jun
Li, Guo-Zheng
author_facet Wang, Xiao
Zhang, Jun
Li, Guo-Zheng
author_sort Wang, Xiao
collection PubMed
description BACKGROUND: It has become a very important and full of challenge task to predict bacterial protein subcellular locations using computational methods. Although there exist a lot of prediction methods for bacterial proteins, the majority of these methods can only deal with single-location proteins. But unfortunately many multi-location proteins are located in the bacterial cells. Moreover, multi-location proteins have special biological functions capable of helping the development of new drugs. So it is necessary to develop new computational methods for accurately predicting subcellular locations of multi-location bacterial proteins. RESULTS: In this article, two efficient multi-label predictors, Gpos-ECC-mPLoc and Gneg-ECC-mPLoc, are developed to predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins respectively. The two multi-label predictors construct the GO vectors by using the GO terms of homologous proteins of query proteins and then adopt a powerful multi-label ensemble classifier to make the final multi-label prediction. The two multi-label predictors have the following advantages: (1) they improve the prediction performance of multi-label proteins by taking the correlations among different labels into account; (2) they ensemble multiple CC classifiers and further generate better prediction results by ensemble learning; and (3) they construct the GO vectors by using the frequency of occurrences of GO terms in the typical homologous set instead of using 0/1 values. Experimental results show that Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins respectively. CONCLUSIONS: Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently improve prediction accuracy of subcellular localization of multi-location gram-positive and gram-negative bacterial proteins respectively. The online web servers for Gpos-ECC-mPLoc and Gneg-ECC-mPLoc predictors are freely accessible at http://biomed.zzuli.edu.cn/bioinfo/gpos-ecc-mploc/ and http://biomed.zzuli.edu.cn/bioinfo/gneg-ecc-mploc/ respectively.
format Online
Article
Text
id pubmed-4705491
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-47054912016-01-20 Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble Wang, Xiao Zhang, Jun Li, Guo-Zheng BMC Bioinformatics Research BACKGROUND: It has become a very important and full of challenge task to predict bacterial protein subcellular locations using computational methods. Although there exist a lot of prediction methods for bacterial proteins, the majority of these methods can only deal with single-location proteins. But unfortunately many multi-location proteins are located in the bacterial cells. Moreover, multi-location proteins have special biological functions capable of helping the development of new drugs. So it is necessary to develop new computational methods for accurately predicting subcellular locations of multi-location bacterial proteins. RESULTS: In this article, two efficient multi-label predictors, Gpos-ECC-mPLoc and Gneg-ECC-mPLoc, are developed to predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins respectively. The two multi-label predictors construct the GO vectors by using the GO terms of homologous proteins of query proteins and then adopt a powerful multi-label ensemble classifier to make the final multi-label prediction. The two multi-label predictors have the following advantages: (1) they improve the prediction performance of multi-label proteins by taking the correlations among different labels into account; (2) they ensemble multiple CC classifiers and further generate better prediction results by ensemble learning; and (3) they construct the GO vectors by using the frequency of occurrences of GO terms in the typical homologous set instead of using 0/1 values. Experimental results show that Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently predict the subcellular locations of multi-label gram-positive and gram-negative bacterial proteins respectively. CONCLUSIONS: Gpos-ECC-mPLoc and Gneg-ECC-mPLoc can efficiently improve prediction accuracy of subcellular localization of multi-location gram-positive and gram-negative bacterial proteins respectively. The online web servers for Gpos-ECC-mPLoc and Gneg-ECC-mPLoc predictors are freely accessible at http://biomed.zzuli.edu.cn/bioinfo/gpos-ecc-mploc/ and http://biomed.zzuli.edu.cn/bioinfo/gneg-ecc-mploc/ respectively. BioMed Central 2015-08-25 /pmc/articles/PMC4705491/ /pubmed/26329681 http://dx.doi.org/10.1186/1471-2105-16-S12-S1 Text en Copyright © 2015 Wang et al.; http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wang, Xiao
Zhang, Jun
Li, Guo-Zheng
Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title_full Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title_fullStr Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title_full_unstemmed Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title_short Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
title_sort multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4705491/
https://www.ncbi.nlm.nih.gov/pubmed/26329681
http://dx.doi.org/10.1186/1471-2105-16-S12-S1
work_keys_str_mv AT wangxiao multilocationgrampositiveandgramnegativebacterialproteinsubcellularlocalizationusinggeneontologyandmultilabelclassifierensemble
AT zhangjun multilocationgrampositiveandgramnegativebacterialproteinsubcellularlocalizationusinggeneontologyandmultilabelclassifierensemble
AT liguozheng multilocationgrampositiveandgramnegativebacterialproteinsubcellularlocalizationusinggeneontologyandmultilabelclassifierensemble