Imbalanced Multi-Modal Multi-Label Learning for Subcellular Localization Prediction of Human Proteins with Both Single and Multiple Sites

It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them typically focused on the proteins with only one location. In recent years, researchers have begu...

Bibliographic Details
Main authors: He, Jianjun; Gu, Hong; Liu, Wenqi
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3371015/
https://www.ncbi.nlm.nih.gov/pubmed/22715364
http://dx.doi.org/10.1371/journal.pone.0037155
author He, Jianjun
Gu, Hong
Liu, Wenqi
collection PubMed
description It is well known that an important step toward understanding the functions of a protein is to determine its subcellular location. Although numerous prediction algorithms have been developed, most of them typically focused on the proteins with only one location. In recent years, researchers have begun to pay attention to the subcellular localization prediction of the proteins with multiple sites. However, almost all existing approaches fail to take into account the correlations among locations that arise from proteins with multiple sites, which may be important information for improving prediction accuracy for such proteins. In this paper, a new algorithm that can effectively exploit the correlations among the locations is proposed, using a Gaussian process model. In addition, the algorithm can realize an optimal linear combination of various feature extraction technologies and can be robust to imbalanced data sets. Experimental results on a human protein data set show that the proposed algorithm is valid and can achieve better performance than the existing approaches.
format Online
Article
Text
id pubmed-3371015
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3371015 2012-06-19 Imbalanced Multi-Modal Multi-Label Learning for Subcellular Localization Prediction of Human Proteins with Both Single and Multiple Sites. He, Jianjun; Gu, Hong; Liu, Wenqi. PLoS One. Research Article. Public Library of Science 2012-06-08. /pmc/articles/PMC3371015/ /pubmed/22715364 http://dx.doi.org/10.1371/journal.pone.0037155 Text en. He et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Imbalanced Multi-Modal Multi-Label Learning for Subcellular Localization Prediction of Human Proteins with Both Single and Multiple Sites
topic Research Article