Function Prediction for G Protein-Coupled Receptors through Text Mining and Induction Matrix Completion

Bibliographic Details
Main Authors: Wu, Jiansheng; Yin, Qin; Zhang, Chengxin; Geng, Jingjing; Wu, Hongjie; Hu, Haifeng; Ke, Xiaoyan; Zhang, Yang
Format: Online Article Text
Language: English
Published: American Chemical Society 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6649004/
https://www.ncbi.nlm.nih.gov/pubmed/31459527
http://dx.doi.org/10.1021/acsomega.8b02454
Description
Summary: G protein-coupled receptors (GPCRs) are a key component of cellular signal transduction. Accurately annotating the biological functions of GPCR proteins is vital to understanding the physiological processes in which they are involved. With the rapid development of text mining technologies and the exponential growth of the biomedical literature, it has become urgent to mine functional information from the literature in order to annotate known GPCRs systematically and reliably. We design a novel three-stage approach, TM–IMC, which uses text mining and inductive matrix completion to automatically predict the gene ontology (GO) terms of GPCR proteins. Large-scale benchmark tests show that inductive matrix completion models contribute to GPCR–GO association prediction for both the molecular function and biological process aspects. Moreover, our detailed data analysis shows that information extracted from GPCR-associated literature does contribute to the prediction of GPCR–GO associations. The study demonstrates a new avenue for improving the accuracy of GPCR function annotation by combining text mining with inductive matrix completion, relative to baseline methods in the critical assessment of protein function annotation algorithms and to literature-based GO annotation methods. The source code of TM–IMC and the datasets used can be freely downloaded from https://zhanglab.ccmb.med.umich.edu/TM-IMC for academic purposes.