Identification of Protein Subcellular Localization With Network and Functional Embeddings

The functions of proteins are mainly determined by their subcellular localizations in cells. Currently, many computational methods for predicting the subcellular localization of proteins have been proposed. However, these methods require further improvement, especially when used in protein represent...

Bibliographic Details
Main Authors: Pan, Xiaoyong; Li, Hao; Zeng, Tao; Li, Zhandong; Chen, Lei; Huang, Tao; Cai, Yu-Dong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7873866/
https://www.ncbi.nlm.nih.gov/pubmed/33584818
http://dx.doi.org/10.3389/fgene.2020.626500
author Pan, Xiaoyong
Li, Hao
Zeng, Tao
Li, Zhandong
Chen, Lei
Huang, Tao
Cai, Yu-Dong
collection PubMed
description The functions of proteins are mainly determined by their subcellular localizations in cells. Currently, many computational methods for predicting the subcellular localization of proteins have been proposed. However, these methods require further improvement, especially when used in protein representations. In this study, we present an embedding-based method for predicting the subcellular localization of proteins. We first learn the functional embeddings of KEGG/GO terms, which are further used in representing proteins. Then, we characterize the network embeddings of proteins on a protein–protein network. The functional and network embeddings are combined as novel representations of protein locations for the construction of the final classification model. In our collected benchmark dataset with 4,861 proteins from 16 locations, the best model shows a Matthews correlation coefficient of 0.872 and is thus superior to multiple conventional methods.
format Online
Article
Text
id pubmed-7873866
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7873866 2021-02-11 Identification of Protein Subcellular Localization With Network and Functional Embeddings Pan, Xiaoyong; Li, Hao; Zeng, Tao; Li, Zhandong; Chen, Lei; Huang, Tao; Cai, Yu-Dong Front Genet Genetics Frontiers Media S.A. 2021-01-20 /pmc/articles/PMC7873866/ /pubmed/33584818 http://dx.doi.org/10.3389/fgene.2020.626500 Text en Copyright © 2021 Pan, Li, Zeng, Li, Chen, Huang and Cai. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identification of Protein Subcellular Localization With Network and Functional Embeddings
topic Genetics