Multimodal reasoning based on knowledge graph embedding for specific diseases

Bibliographic Details
Main authors: Zhu, Chaoyu; Yang, Zhihao; Xia, Xiaoqiong; Li, Nan; Zhong, Fan; Liu, Lei
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9004655/
https://www.ncbi.nlm.nih.gov/pubmed/35150235
http://dx.doi.org/10.1093/bioinformatics/btac085
Description
Summary:
MOTIVATION: Knowledge Graphs (KGs) are becoming increasingly important in the biomedical field. Deriving new and reliable knowledge from existing knowledge with KG embedding technology is a cutting-edge method. Some approaches add a variety of additional information to aid reasoning, which is known as multimodal reasoning. However, few works based on the existing biomedical KGs focus on specific diseases.
RESULTS: This work develops a construction and multimodal reasoning process for Specific Disease Knowledge Graphs (SDKGs). We construct SDKG-11, an SDKG set including five cancers, six non-cancer diseases, a combined Cancer5 and a combined Diseases11, aiming to discover new reliable knowledge and provide universal pre-trained knowledge for each specific disease field. SDKG-11 is obtained through original triplet extraction, standard entity set construction, entity linking and relation linking. We implement multimodal reasoning by reverse-hyperplane projection for SDKGs based on structure, category and description embeddings. Multimodal reasoning improves pre-existing models on all SDKGs when the entity prediction task is used as the evaluation protocol. We verify the model's reliability in discovering new knowledge by manually proofreading predicted drug–gene, gene–disease and disease–drug pairs. Using the embedding results as initialization parameters for biomolecular interaction classification, we demonstrate the universality of the embedding models.
AVAILABILITY AND IMPLEMENTATION: The constructed SDKG-11 and the TensorFlow implementation are available from https://github.com/ZhuChaoY/SDKG-11.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
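
The summary names entity prediction as the evaluation protocol but does not spell it out. The following is a minimal sketch of how such a protocol is commonly run, not the authors' implementation: it scores candidate tails with a plain TransE-style distance (a stand-in chosen for brevity, not the reverse-hyperplane projection described above) and reports mean reciprocal rank and Hits@k. The entity names, relation name, embedding values and function names are all hypothetical and exist only for illustration.

```python
# Toy entity-prediction evaluation. All names and numbers are hypothetical.

def transe_score(h, r, t):
    """L1 distance ||h + r - t||; a lower score means a more plausible triple."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def tail_rank(head, rel, true_tail, ent_emb, rel_emb, known=frozenset()):
    """Rank of the true tail among all candidate tails (1 is best).

    Candidates that already form a known true triple are skipped,
    which is the usual "filtered" setting.
    """
    h, r = ent_emb[head], rel_emb[rel]
    true_score = transe_score(h, r, ent_emb[true_tail])
    rank = 1
    for cand, vec in ent_emb.items():
        if cand == true_tail or (head, rel, cand) in known:
            continue
        if transe_score(h, r, vec) < true_score:
            rank += 1
    return rank

def evaluate(test_triples, ent_emb, rel_emb, known=frozenset(), k=10):
    """Return (mean reciprocal rank, Hits@k) over the test triples."""
    ranks = [tail_rank(h, r, t, ent_emb, rel_emb, known)
             for h, r, t in test_triples]
    mrr = sum(1.0 / x for x in ranks) / len(ranks)
    hits = sum(x <= k for x in ranks) / len(ranks)
    return mrr, hits

# Hypothetical 2-dimensional embeddings for a four-entity toy graph.
ENT = {
    "drug_A": [0.0, 0.0],
    "gene_B": [1.0, 0.0],
    "disease_C": [0.0, 2.0],
    "gene_D": [3.0, 3.0],
}
REL = {"targets": [1.0, 0.0]}
TEST = [("drug_A", "targets", "gene_B"), ("disease_C", "targets", "gene_D")]

if __name__ == "__main__":
    mrr, hits_at_1 = evaluate(TEST, ENT, REL, k=1)
    print(f"MRR={mrr:.3f} Hits@1={hits_at_1:.2f}")
```

In a multimodal setting, the same ranking loop would apply unchanged; only the scoring function would combine structure, category and description embeddings instead of using a single vector per entity.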