
Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization

Recent years have witnessed much progress in computational modelling for protein subcellular localization. However, the existing sequence-based predictive models demonstrate moderate or unsatisfactory performance, and the gene ontology (GO) based models may take the risk of performance overestimatio...


Bibliographic Details
Main Author: Mei, Suyu
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3374840/
https://www.ncbi.nlm.nih.gov/pubmed/22719847
http://dx.doi.org/10.1371/journal.pone.0037716
_version_ 1782235693339115520
author Mei, Suyu
author_facet Mei, Suyu
author_sort Mei, Suyu
collection PubMed
description Recent years have witnessed much progress in computational modelling for protein subcellular localization. However, the existing sequence-based predictive models demonstrate moderate or unsatisfactory performance, and the gene ontology (GO) based models risk overestimating performance for novel proteins. Furthermore, many human proteins have multiple subcellular locations, which makes the computational modelling more complicated. To date, few studies have specialized in predicting the subcellular localization of human proteins that may reside in multiple cellular compartments. In this paper, we propose a multi-label multi-kernel transfer learning model for human protein subcellular localization (MLMK-TLM). MLMK-TLM introduces a multi-label confusion matrix, formally defines three multi-labelling performance measures, and adapts one-against-all multi-class probabilistic outputs to the multi-label learning scenario. On this basis, it extends our published work GO-TLM (gene ontology based transfer learning model for protein subcellular localization) and MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) to multiplex human protein subcellular localization. With proper homolog knowledge transfer, a comprehensive survey of model performance for novel proteins, and multi-labelling capability, MLMK-TLM is expected to be more practically applicable. Experiments on a human protein benchmark dataset show that MLMK-TLM significantly outperforms the baseline model and demonstrates good multi-labelling ability for novel human proteins. Some findings (predictions) are validated by the latest Swiss-Prot database. The software can be freely downloaded at http://soft.synu.edu.cn/upload/msy.rar.
format Online
Article
Text
id pubmed-3374840
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3374840 2012-06-20 Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization Mei, Suyu PLoS One Research Article
Public Library of Science 2012-06-13 /pmc/articles/PMC3374840/ /pubmed/22719847 http://dx.doi.org/10.1371/journal.pone.0037716 Text en Suyu Mei. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Mei, Suyu
Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title_full Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title_fullStr Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title_full_unstemmed Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title_short Multi-Label Multi-Kernel Transfer Learning for Human Protein Subcellular Localization
title_sort multi-label multi-kernel transfer learning for human protein subcellular localization
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3374840/
https://www.ncbi.nlm.nih.gov/pubmed/22719847
http://dx.doi.org/10.1371/journal.pone.0037716
work_keys_str_mv AT meisuyu multilabelmultikerneltransferlearningforhumanproteinsubcellularlocalization