Augmenting Product Defect Surveillance Through Web Crawling and Machine Learning in Singapore

INTRODUCTION: Substandard medicines are medicines that fail to meet their quality standards and/or specifications. Substandard medicines can lead to serious safety issues affecting public health. With the increasing number of pharmaceuticals and the complexity of the pharmaceutical manufacturing sup...

Bibliographic Details
Main Authors: Ang, Pei San; Teo, Desmond Chun Hwee; Dorajoo, Sreemanee Raaj; Prem Kumar, Mukundaram; Chan, Yi Hao; Choong, Chih Tzer; Phuah, Doris Sock Tin; Tan, Dorothy Hooi Myn; Tan, Filina Meixuan; Huang, Huilin; Tan, Maggie Siok Hwee; Ng, Michelle Sau Yuen; Poh, Jalene Wang Woon
Format: Online Article Text
Language: English
Published: Springer International Publishing 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8214454/
https://www.ncbi.nlm.nih.gov/pubmed/34148223
http://dx.doi.org/10.1007/s40264-021-01084-w
_version_ 1783710066345508864
author Ang, Pei San
Teo, Desmond Chun Hwee
Dorajoo, Sreemanee Raaj
Prem Kumar, Mukundaram
Chan, Yi Hao
Choong, Chih Tzer
Phuah, Doris Sock Tin
Tan, Dorothy Hooi Myn
Tan, Filina Meixuan
Huang, Huilin
Tan, Maggie Siok Hwee
Ng, Michelle Sau Yuen
Poh, Jalene Wang Woon
author_sort Ang, Pei San
collection PubMed
description INTRODUCTION: Substandard medicines are medicines that fail to meet their quality standards and/or specifications. Substandard medicines can lead to serious safety issues affecting public health. With the increasing number of pharmaceuticals and the complexity of the pharmaceutical manufacturing supply chain, monitoring for substandard medicines via manual environmental scanning can be laborious and time consuming. METHODS: A web crawler was developed to automatically detect and extract alerts on substandard medicines published on the Internet by regulatory agencies. The crawled data were labelled as related to substandard medicines or not. An expert-derived keyword-based classification algorithm was compared against machine learning algorithms to identify substandard medicine alerts on two validation datasets (n = 4920 and n = 2458) from a later time period than training data. Models were comparatively assessed for recall, precision and their F1 scores (harmonic mean of precision and recall). RESULTS: The web crawler routinely extracted alerts from the 46 web pages belonging to nine regulatory agencies. From October 2019 to May 2020, 12,156 unique alerts were crawled of which 7378 (60.7%) alerts were set aside for validation and contained 1160 substandard medicine alerts (15.7%). An ensemble approach of combining machine learning and keywords achieved the best recall (94% and 97%), precision (85% and 80%) and F1 scores (89% and 88%) on temporal validation. CONCLUSIONS: Combining robust web crawler programmes with rigorously tested filtering algorithms based on machine learning and keyword models can automate and expand horizon scanning capabilities for issues relating to substandard medicines. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s40264-021-01084-w.
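The description defines the F1 score as the harmonic mean of precision and recall, and reports dataset counts alongside percentages. The following is a minimal Python sketch that recomputes those figures from the numbers stated in the abstract; it is not code from the article. The function name `f1_score` and the variable names are hypothetical. Because the published precision and recall values are already rounded to whole percentages, the recomputed F1 values can only be expected to agree with the published ones after rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures reported in the abstract for the two temporal
# validation datasets (n = 4920 and n = 2458).
reported = [
    {"n": 4920, "recall": 0.94, "precision": 0.85, "f1": 0.89},
    {"n": 2458, "recall": 0.97, "precision": 0.80, "f1": 0.88},
]

# Recompute F1 from the rounded precision and recall values.
f1_values = [f1_score(r["precision"], r["recall"]) for r in reported]

# Dataset counts stated in the abstract.
total_alerts = 12156        # unique alerts crawled, October 2019 to May 2020
substandard_alerts = 1160   # substandard medicine alerts in the validation data
validation_alerts = sum(r["n"] for r in reported)

# Shares that the abstract reports as 60.7% and 15.7%.
validation_share = validation_alerts / total_alerts
substandard_share = substandard_alerts / validation_alerts

if __name__ == "__main__":
    for r, f1 in zip(reported, f1_values):
        print(f"n={r['n']}: recomputed F1={f1:.4f}, reported F1={r['f1']:.2f}")
    print(f"validation alerts: {validation_alerts}")
    print(f"validation share: {validation_share:.1%}")
    print(f"substandard share: {substandard_share:.1%}")
```

The two validation sets sum to the 7378 alerts set aside for validation, and the recomputed shares and F1 scores agree with the reported percentages after rounding.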
format Online
Article
Text
id pubmed-8214454
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-8214454 2021-06-21 Drug Saf Original Research Article Springer International Publishing 2021-06-19 2021 /pmc/articles/PMC8214454/ /pubmed/34148223 http://dx.doi.org/10.1007/s40264-021-01084-w Text en © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Augmenting Product Defect Surveillance Through Web Crawling and Machine Learning in Singapore
topic Original Research Article