Information Extraction From FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing
Towards the objectives of the United States Food and Drug Administration (FDA) generic drug science and research program, it is vitally important to develop product-specific guidances (PSGs) with recommendations that can facilitate and guide generic product development. To generate a PSG, the as...
Main authors: | Shi, Yiwen; Ren, Ping; Zhang, Yi; Gong, Xiajing; Hu, Meng; Liang, Hualou |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Research Metrics and Analytics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8222600/ https://www.ncbi.nlm.nih.gov/pubmed/34179681 http://dx.doi.org/10.3389/frma.2021.670006 |
_version_ | 1783711518044454912 |
---|---|
author | Shi, Yiwen Ren, Ping Zhang, Yi Gong, Xiajing Hu, Meng Liang, Hualou |
author_facet | Shi, Yiwen Ren, Ping Zhang, Yi Gong, Xiajing Hu, Meng Liang, Hualou |
author_sort | Shi, Yiwen |
collection | PubMed |
description | Towards the objectives of the United States Food and Drug Administration (FDA) generic drug science and research program, it is vitally important to develop product-specific guidances (PSGs) with recommendations that can facilitate and guide generic product development. To generate a PSG, the assessor needs to retrieve supportive information about the drug product of interest, including from the drug labeling, which contains comprehensive information about drug products and instructions to physicians on how to use the products for treatment. Currently, although there are many drug labeling data resources, none of them, including those developed by the FDA (e.g., Drugs@FDA), covers all FDA-approved drug products. Furthermore, these resources, housed in various locations, are often in forms that are not compatible or interoperable with each other. Therefore, there is a great demand for retrieving useful information from a large number of textual documents across different data resources to support effective PSG development. To meet this need, we developed a Natural Language Processing (NLP) pipeline that integrates multiple disparate publicly available data resources to extract drug product information with minimal human intervention. We provide a case study on identifying food effect information to illustrate how a machine learning model is employed to achieve accurate paragraph labeling. We show that the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model outperforms traditional machine learning techniques, setting a new state of the art for labeling food effect paragraphs from drug labeling and approved drug products datasets. |
format | Online Article Text |
id | pubmed-8222600 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8222600 2021-06-25 Information Extraction From FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing Shi, Yiwen Ren, Ping Zhang, Yi Gong, Xiajing Hu, Meng Liang, Hualou Front Res Metr Anal Research Metrics and Analytics Frontiers Media S.A. 2021-06-10 /pmc/articles/PMC8222600/ /pubmed/34179681 http://dx.doi.org/10.3389/frma.2021.670006 Text en Copyright © 2021 Shi, Ren, Zhang, Gong, Hu and Liang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Research Metrics and Analytics Shi, Yiwen Ren, Ping Zhang, Yi Gong, Xiajing Hu, Meng Liang, Hualou Information Extraction From FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing |
title | Information Extraction From FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing |
title_sort | information extraction from fda drug labeling to enhance product-specific guidance assessment using natural language processing |
topic | Research Metrics and Analytics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8222600/ https://www.ncbi.nlm.nih.gov/pubmed/34179681 http://dx.doi.org/10.3389/frma.2021.670006 |