EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks
G protein-coupled receptors (GPCRs) are one of the largest families of membrane protein receptors in humans and are important targets for many drugs. Hence, it is of great significance to determine whether a protein is a GPCR. However, identifying GPCRs by experimental methods is very expensive...
Main authors: | Qiu, Wangren; Lv, Zhe; Xiao, Xuan; Shao, Shuai; Lin, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology, 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8437786/ https://www.ncbi.nlm.nih.gov/pubmed/34527200 http://dx.doi.org/10.1016/j.csbj.2021.08.044 |
_version_ | 1783752228450861056 |
---|---|
author | Qiu, Wangren; Lv, Zhe; Xiao, Xuan; Shao, Shuai; Lin, Hao |
author_facet | Qiu, Wangren; Lv, Zhe; Xiao, Xuan; Shao, Shuai; Lin, Hao |
author_sort | Qiu, Wangren |
collection | PubMed |
description | G protein-coupled receptors (GPCRs) are one of the largest families of membrane protein receptors in humans and are important targets for many drugs. Hence, it is of great significance to determine whether a protein is a GPCR. However, identifying GPCRs by experimental methods is very expensive and time-consuming. As more and more GPCR primary sequences accumulate, it has become feasible to develop a computational model that predicts GPCRs precisely and quickly. In this paper, a novel method called EMCBOW-GPCR is proposed to improve the accuracy of identifying GPCRs based on natural language processing (NLP). To represent GPCRs, three word-embedding models and a bag-of-words model are used to extract the original features. The original features are then fed into a deep-learning algorithm that extracts further features and reduces the dimensionality. Finally, the resulting features are fed into Extreme Gradient Boosting. A comparison of results shows that the overall prediction metrics of EMCBOW-GPCR are higher than those of state-of-the-art methods. To make EMCBOW-GPCR convenient for more researchers to use, the method and source code have been released on GitHub at https://github.com/454170054/EMCBOW-GPCR, and a user-friendly web server for EMCBOW-GPCR has been established at http://www.jci-bioinfo.cn/emcbowgpcr. |
format | Online Article Text |
id | pubmed-8437786 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-8437786 2021-09-14 EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks Qiu, Wangren; Lv, Zhe; Xiao, Xuan; Shao, Shuai; Lin, Hao Comput Struct Biotechnol J Research Article G protein-coupled receptors (GPCRs) are one of the largest families of membrane protein receptors in humans and are important targets for many drugs. Hence, it is of great significance to determine whether a protein is a GPCR. However, identifying GPCRs by experimental methods is very expensive and time-consuming. As more and more GPCR primary sequences accumulate, it has become feasible to develop a computational model that predicts GPCRs precisely and quickly. In this paper, a novel method called EMCBOW-GPCR is proposed to improve the accuracy of identifying GPCRs based on natural language processing (NLP). To represent GPCRs, three word-embedding models and a bag-of-words model are used to extract the original features. The original features are then fed into a deep-learning algorithm that extracts further features and reduces the dimensionality. Finally, the resulting features are fed into Extreme Gradient Boosting. A comparison of results shows that the overall prediction metrics of EMCBOW-GPCR are higher than those of state-of-the-art methods. To make EMCBOW-GPCR convenient for more researchers to use, the method and source code have been released on GitHub at https://github.com/454170054/EMCBOW-GPCR, and a user-friendly web server for EMCBOW-GPCR has been established at http://www.jci-bioinfo.cn/emcbowgpcr. Research Network of Computational and Structural Biotechnology 2021-08-31 /pmc/articles/PMC8437786/ /pubmed/34527200 http://dx.doi.org/10.1016/j.csbj.2021.08.044 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Qiu, Wangren Lv, Zhe Xiao, Xuan Shao, Shuai Lin, Hao EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title | EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title_full | EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title_fullStr | EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title_full_unstemmed | EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title_short | EMCBOW-GPCR: A method for identifying G-protein coupled receptors based on word embedding and wordbooks |
title_sort | emcbow-gpcr: a method for identifying g-protein coupled receptors based on word embedding and wordbooks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8437786/ https://www.ncbi.nlm.nih.gov/pubmed/34527200 http://dx.doi.org/10.1016/j.csbj.2021.08.044 |
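
The description in this record mentions a bag-of-words model applied to protein sequences. As a rough illustration only, the sketch below treats overlapping k-mers of amino acids as "words" and turns a sequence into a fixed-length frequency vector. The k-mer tokenization, the value `k=2`, the function names `tokenize` and `bag_of_words`, and the example sequence are all assumptions made for illustration; the record does not state how EMCBOW-GPCR defines its words, and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tokenize(sequence, k=2):
    """Split a protein sequence into overlapping k-mer 'words' (assumed scheme)."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def bag_of_words(sequence, k=2):
    """Return normalized k-mer frequencies over a fixed vocabulary of 20**k words."""
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(tokenize(sequence, k))
    # Only words made of standard amino acids contribute to the total.
    total = sum(counts[word] for word in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[word] / total for word in vocabulary]


# Hypothetical short sequence, used only to show the call.
features = bag_of_words("MNGTEGPNFYVPFSNKTGVV", k=2)
print(len(features))
```

A vector like this could serve as one of several inputs to a downstream classifier such as a gradient-boosting model; the word-embedding and deep-learning stages named in the description are not covered by this sketch.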