DICE: A Drug Indication Classification and Encyclopedia for AI-Based Indication Extraction

Drug labeling contains an ‘INDICATIONS AND USAGE’ section that provides vital information to support clinical decision making and regulatory management. Effective extraction of drug indication information from free-text based resources could facilitate drug repositioning projects and help collect real-world...

Bibliographic Details
Main Authors: Bhatt, Arjun; Roberts, Ruth; Chen, Xi; Li, Ting; Connor, Skylar; Hatim, Qais; Mikailov, Mike; Tong, Weida; Liu, Zhichao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8366025/
https://www.ncbi.nlm.nih.gov/pubmed/34409286
http://dx.doi.org/10.3389/frai.2021.711467
author Bhatt, Arjun
Roberts, Ruth
Chen, Xi
Li, Ting
Connor, Skylar
Hatim, Qais
Mikailov, Mike
Tong, Weida
Liu, Zhichao
author_sort Bhatt, Arjun
collection PubMed
description Drug labeling contains an ‘INDICATIONS AND USAGE’ section that provides vital information to support clinical decision making and regulatory management. Effective extraction of drug indication information from free-text based resources could facilitate drug repositioning projects and help collect real-world evidence in support of secondary use of approved medicines. To enable AI-powered language models for the extraction of drug indication information, we used manual reading and curation to develop a Drug Indication Classification and Encyclopedia (DICE) based on FDA-approved human prescription drug labeling. A DICE scheme with 7,231 sentences categorized into five classes (indications, contraindications, side effects, usage instructions, and clinical observations) was developed. To further elucidate the utility of the DICE, we developed nine different AI-based classifiers for the prediction of indications based on the developed DICE and comprehensively assessed their performance. We found that the transformer-based language models yielded an average MCC of 0.887, outperforming the word embedding-based bidirectional long short-term memory (BiLSTM) models (0.862) with a 2.82% improvement on the test set. The best classifiers were also used to extract drug indication information in DrugBank and achieved a high enrichment rate (>0.930) for this task. We found that domain-specific training could provide more explainable models without sacrificing performance, along with better generalization to external validation datasets. Altogether, the proposed DICE could be a standard resource for the development and evaluation of task-specific, AI-powered natural language processing (NLP) models.
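The abstract compares classifiers by Matthews correlation coefficient (MCC). As a minimal sketch, the standard binary MCC can be computed from confusion-matrix counts as shown below. The function names `mcc` and `relative_difference` are illustrative and do not come from the paper, and the record does not say whether the authors used a binary or multiclass form of the metric. The second helper checks the arithmetic behind the reported figure: 2.82% matches the MCC gap of 0.025 expressed relative to the transformer score of 0.887, which is an inference from the numbers rather than something the record states.

```python
import math


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Standard binary Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def relative_difference(a: float, b: float, base: float) -> float:
    """Difference a - b expressed as a fraction of `base`."""
    return (a - b) / base


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the paper).
    print(round(mcc(tp=90, fp=10, fn=5, tn=95), 3))

    # Average MCC values quoted in the abstract.
    transformer, bilstm = 0.887, 0.862
    print(round(100 * relative_difference(transformer, bilstm, base=transformer), 2))
    print(round(100 * relative_difference(transformer, bilstm, base=bilstm), 2))
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect agreement), which makes it a reasonable single-number summary when classes are imbalanced.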
format Online
Article
Text
id pubmed-8366025
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8366025 2021-08-17 Front Artif Intell Artificial Intelligence Frontiers Media S.A.
2021-08-02 /pmc/articles/PMC8366025/ /pubmed/34409286 http://dx.doi.org/10.3389/frai.2021.711467 Text en Copyright © 2021 Bhatt, Roberts, Chen, Li, Connor, Hatim, Mikailov, Tong and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title DICE: A Drug Indication Classification and Encyclopedia for AI-Based Indication Extraction
topic Artificial Intelligence