
A natural language processing approach towards harmonisation of European medicinal product information

Product information (PI) is a vital part of any medicinal product approved for use within the European Union and consists of a summary of product characteristics (SmPC) for healthcare professionals and a package leaflet (PL) for patients, together with the product packaging. In this study, based on t...


Bibliographic Details
Main Authors:	Bergman, Erik; Sherwood, Kim; Forslund, Markus; Arlett, Peter; Westman, Gabriel
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9584511/ https://www.ncbi.nlm.nih.gov/pubmed/36264941 http://dx.doi.org/10.1371/journal.pone.0275386

_version_	1784813282932031488
author	Bergman, Erik Sherwood, Kim Forslund, Markus Arlett, Peter Westman, Gabriel
author_facet	Bergman, Erik Sherwood, Kim Forslund, Markus Arlett, Peter Westman, Gabriel
author_sort	Bergman, Erik
collection	PubMed
description	Product information (PI) is a vital part of any medicinal product approved for use within the European Union and consists of a summary of product characteristics (SmPC) for healthcare professionals and a package leaflet (PL) for patients, together with the product packaging. In this study, based on the English corpus of the EMA product information documents for all centrally approved medicinal products within the EU, a BERT sentence embedding model was used together with clustering and dimensionality reduction techniques to identify sentence similarity clusters that could be candidates for standardization. A total of 1258 medicinal products were included in the study. From these, a total of 783 K sentences were extracted from SmPC and PL documents, which were aggregated into a total of 284 and 129 semantic similarity clusters, respectively. The spread distribution among clusters shows separation into different cluster types. Examples of clusters with low spread include those with identical word embeddings due to current standardization, such as section headings and standard phrases. Others show minor linguistic variations, while the group with the largest variability contains variable wording but with significant semantic overlap. The sentence clusters identified could serve as candidates for further standardization of the PI. Moving from free-text human wording to auto-generated text elements based on multiple-choice input for appropriate parts of the package leaflet and summary of product characteristics could reduce both time and complexity for applicants as well as regulators, and ultimately provide patients and prescribers with documents that are easier to understand and better adapted for searching.
format	Online Article Text
id	pubmed-9584511
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-9584511 2022-10-21 A natural language processing approach towards harmonisation of European medicinal product information Bergman, Erik Sherwood, Kim Forslund, Markus Arlett, Peter Westman, Gabriel PLoS One Research Article
Public Library of Science 2022-10-20 /pmc/articles/PMC9584511/ /pubmed/36264941 http://dx.doi.org/10.1371/journal.pone.0275386 Text en © 2022 Bergman et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	A natural language processing approach towards harmonisation of European medicinal product information
title_sort	natural language processing approach towards harmonisation of european medicinal product information
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9584511/ https://www.ncbi.nlm.nih.gov/pubmed/36264941 http://dx.doi.org/10.1371/journal.pone.0275386
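
The description field in this record outlines a pipeline of sentence embedding, clustering, and a per-cluster "spread" measure. The sketch below illustrates that idea only in miniature and is not the authors' implementation: a bag-of-words count vector stands in for the BERT sentence embedding, a greedy similarity-threshold grouping stands in for the clustering step, and dimensionality reduction is omitted. The function names, the 0.6 threshold, and the example sentences are all assumptions made for illustration and are not taken from the study's data.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(sentences, threshold=0.6):
    """Greedy grouping: a sentence joins the first cluster whose seed
    (first member) is at least `threshold` similar; otherwise it starts
    a new cluster. The threshold value is an arbitrary assumption."""
    clusters = []
    for s in sentences:
        v = embed(s)
        for members in clusters:
            if cosine(v, embed(members[0])) >= threshold:
                members.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def spread(members):
    """Mean cosine distance of members to the cluster centroid.
    A value near 0 means the wording is (almost) identical."""
    vectors = [embed(s) for s in members]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)  # summed vector; cosine ignores the scale
    return sum(1.0 - cosine(v, centroid) for v in vectors) / len(vectors)


# Made-up example sentences in the style of a package leaflet.
sentences = [
    "Keep this medicine out of the sight and reach of children.",
    "Keep this medicine out of the sight and reach of children.",
    "Keep out of the sight and reach of children.",
    "Do not use this medicine after the expiry date stated on the carton.",
    "Do not use this medicine after the expiry date which is stated on the label.",
]
clusters = cluster(sentences)
for members in clusters:
    print(len(members), round(spread(members), 3), members[0])
```

In this toy setting, a cluster made only of identical sentences has a spread of essentially zero, while clusters with wording variations have a positive spread, which mirrors the distinction the description draws between already standardized phrases and candidates for further standardization.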
