Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM

Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since the protein must be localized in the correct organelle to ensure physiological function. Conventional biological experiments for protein subcellular localization have some limit...

Bibliographic Details
Main Authors: Wu, Liwen, Gao, Song, Yao, Shaowen, Wu, Feng, Li, Jie, Dong, Yunyun, Zhang, Yunqi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9240597/
https://www.ncbi.nlm.nih.gov/pubmed/35783287
http://dx.doi.org/10.3389/fgene.2022.912614
author Wu, Liwen
Gao, Song
Yao, Shaowen
Wu, Feng
Li, Jie
Dong, Yunyun
Zhang, Yunqi
collection PubMed
description Identifying the subcellular localization of a given protein is an essential part of biological and medical research, since a protein must be localized in the correct organelle to perform its physiological function. Conventional biological experiments for protein subcellular localization have limitations, such as high cost and low efficiency, so many computational methods have been proposed to address these problems. However, some of these methods need further improvement for protein subcellular localization under class imbalance. We propose a new model, generating minority samples for protein subcellular localization (Gm-PLoc), to predict the subcellular localization of multi-label proteins. The model comprises three steps: (1) using the position-specific scoring matrix to extract distinguishing features of proteins; (2) synthesizing samples of the minority category with revised generative adversarial networks to balance the distribution of categories; and (3) training a classifier on the rebalanced dataset to predict the subcellular localization of multi-label proteins. The model is evaluated on one benchmark dataset, and the experimental results demonstrate that Gm-PLoc performs well for multi-label protein subcellular localization.
format Online
Article
Text
id pubmed-9240597
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9240597 2022-06-30 Front Genet Genetics Frontiers Media S.A. 2022-06-15 /pmc/articles/PMC9240597/ /pubmed/35783287 http://dx.doi.org/10.3389/fgene.2022.912614 Text en Copyright © 2022 Wu, Gao, Yao, Wu, Li, Dong and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Gm-PLoc: A Subcellular Localization Model of Multi-Label Protein Based on GAN and DeepFM
topic Genetics