Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature
OBJECTIVE: Develop a novel methodology to create a comprehensive knowledge graph (SuppKG) to represent a domain with limited coverage in the Unified Medical Language System (UMLS), specifically dietary supplement (DS) information for discovering drug-supplement interactions (DSI), by leveraging biom...
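Main Authors: | Schutte, Dalton; Vasilakes, Jake; Bompelli, Anu; Zhou, Yuqi; Fiszman, Marcelo; Xu, Hua; Kilicoglu, Halil; Bishop, Jeffrey R.; Adam, Terrence; Zhang, Rui |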
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9335448/ https://www.ncbi.nlm.nih.gov/pubmed/35709900 http://dx.doi.org/10.1016/j.jbi.2022.104120 |
_version_ | 1784759343313321984 |
---|---|
author | Schutte, Dalton Vasilakes, Jake Bompelli, Anu Zhou, Yuqi Fiszman, Marcelo Xu, Hua Kilicoglu, Halil Bishop, Jeffrey R. Adam, Terrence Zhang, Rui |
author_facet | Schutte, Dalton Vasilakes, Jake Bompelli, Anu Zhou, Yuqi Fiszman, Marcelo Xu, Hua Kilicoglu, Halil Bishop, Jeffrey R. Adam, Terrence Zhang, Rui |
author_sort | Schutte, Dalton |
collection | PubMed |
description | OBJECTIVE: Develop a novel methodology to create a comprehensive knowledge graph (SuppKG) to represent a domain with limited coverage in the Unified Medical Language System (UMLS), specifically dietary supplement (DS) information for discovering drug-supplement interactions (DSI), by leveraging biomedical natural language processing (NLP) technologies and a DS domain terminology. MATERIALS AND METHODS: We created SemRepDS (an extension of an NLP tool, SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT model to remove incorrect relations before generating SuppKG. Two discovery pathways were applied to SuppKG to identify potential DSIs, which were then compared with an existing DSI database and also evaluated by medical professionals for mechanistic plausibility. RESULTS: SemRepDS returned 158.5% more DS entities and 206.9% more DS relations than SemRep. The fine-tuned PubMedBERT model, which significantly outperformed other machine learning and BERT models, obtained an F1 score of 0.8605 and removed 43.86% of semantic relations, improving the precision of the relations by 26.4% over pre-filtering. SuppKG consists of 56,635 nodes and 595,222 directed edges, including 2,928 DS-specific nodes and 164,738 edges. Manual review of findings identified 182 of 250 (72.8%) proposed DS-Gene-Drug and 77 of 100 (77%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: With DS terminology added to the UMLS, SemRepDS can find more DS-specific semantic relationships from PubMed than SemRep. The utility of the resulting SuppKG was demonstrated using discovery patterns to find novel DSIs. CONCLUSION: For a domain with limited coverage in traditional terminologies (e.g., UMLS), we demonstrated an approach to leverage domain terminology and improve existing NLP tools to generate a more comprehensive knowledge graph for the downstream task. Although this study focuses on DSI, the method may be adapted to other domains. |
format | Online Article Text |
id | pubmed-9335448 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
record_format | MEDLINE/PubMed |
spelling | pubmed-9335448 2022-07-29 Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature Schutte, Dalton Vasilakes, Jake Bompelli, Anu Zhou, Yuqi Fiszman, Marcelo Xu, Hua Kilicoglu, Halil Bishop, Jeffrey R. Adam, Terrence Zhang, Rui J Biomed Inform Article OBJECTIVE: Develop a novel methodology to create a comprehensive knowledge graph (SuppKG) to represent a domain with limited coverage in the Unified Medical Language System (UMLS), specifically dietary supplement (DS) information for discovering drug-supplement interactions (DSI), by leveraging biomedical natural language processing (NLP) technologies and a DS domain terminology. MATERIALS AND METHODS: We created SemRepDS (an extension of an NLP tool, SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT model to remove incorrect relations before generating SuppKG. Two discovery pathways were applied to SuppKG to identify potential DSIs, which were then compared with an existing DSI database and also evaluated by medical professionals for mechanistic plausibility. RESULTS: SemRepDS returned 158.5% more DS entities and 206.9% more DS relations than SemRep. The fine-tuned PubMedBERT model, which significantly outperformed other machine learning and BERT models, obtained an F1 score of 0.8605 and removed 43.86% of semantic relations, improving the precision of the relations by 26.4% over pre-filtering. SuppKG consists of 56,635 nodes and 595,222 directed edges, including 2,928 DS-specific nodes and 164,738 edges. Manual review of findings identified 182 of 250 (72.8%) proposed DS-Gene-Drug and 77 of 100 (77%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: With DS terminology added to the UMLS, SemRepDS can find more DS-specific semantic relationships from PubMed than SemRep. The utility of the resulting SuppKG was demonstrated using discovery patterns to find novel DSIs. CONCLUSION: For a domain with limited coverage in traditional terminologies (e.g., UMLS), we demonstrated an approach to leverage domain terminology and improve existing NLP tools to generate a more comprehensive knowledge graph for the downstream task. Although this study focuses on DSI, the method may be adapted to other domains. 2022-07 2022-06-13 /pmc/articles/PMC9335448/ /pubmed/35709900 http://dx.doi.org/10.1016/j.jbi.2022.104120 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Schutte, Dalton Vasilakes, Jake Bompelli, Anu Zhou, Yuqi Fiszman, Marcelo Xu, Hua Kilicoglu, Halil Bishop, Jeffrey R. Adam, Terrence Zhang, Rui Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title | Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title_full | Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title_fullStr | Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title_full_unstemmed | Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title_short | Discovering novel drug-supplement interactions using SuppKG generated from the biomedical literature |
title_sort | discovering novel drug-supplement interactions using suppkg generated from the biomedical literature |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9335448/ https://www.ncbi.nlm.nih.gov/pubmed/35709900 http://dx.doi.org/10.1016/j.jbi.2022.104120 |
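
The description above mentions a DS-Gene-Drug discovery pathway applied to a graph of directed semantic relations. A minimal sketch of what such a two-hop pattern search could look like follows. Everything in it is an assumption made for illustration: the node names, the predicate labels, the edge directions, and the helper `ds_gene_drug_paths` are hypothetical and are not taken from SuppKG or from the authors' code.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not SuppKG data.
TRIPLES = [
    ("supplement_A", "INHIBITS", "gene_X"),
    ("gene_X", "INTERACTS_WITH", "drug_1"),
    ("supplement_A", "STIMULATES", "gene_Y"),
    ("gene_Y", "INTERACTS_WITH", "drug_2"),
    ("supplement_B", "INHIBITS", "gene_Z"),  # dead end: gene_Z has no drug edge
]

# Hypothetical node typing; a real graph would derive this from semantic types.
NODE_TYPES = {
    "supplement_A": "DS",
    "supplement_B": "DS",
    "gene_X": "Gene",
    "gene_Y": "Gene",
    "gene_Z": "Gene",
    "drug_1": "Drug",
    "drug_2": "Drug",
}


def ds_gene_drug_paths(triples, node_types):
    """Return every (ds, pred1, gene, pred2, drug) path of the form DS -> Gene -> Drug."""
    # Index outgoing edges by subject so each hop is a dictionary lookup.
    out_edges = defaultdict(list)
    for subj, pred, obj in triples:
        out_edges[subj].append((pred, obj))

    paths = []
    for ds, kind in node_types.items():
        if kind != "DS":
            continue
        for pred1, gene in out_edges.get(ds, []):
            if node_types.get(gene) != "Gene":
                continue
            for pred2, drug in out_edges.get(gene, []):
                if node_types.get(drug) == "Drug":
                    paths.append((ds, pred1, gene, pred2, drug))
    return paths


if __name__ == "__main__":
    for path in ds_gene_drug_paths(TRIPLES, NODE_TYPES):
        print(" -> ".join(path))
```

Each path found this way is only a candidate interaction; in the record above, candidates were compared with an existing DSI database and reviewed by medical professionals for mechanistic plausibility.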