Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data

Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high-risk to low-risk drugs, namely,...

Bibliographic Details
Main authors: Wei, Jianxiang, Feng, Guanzhong, Lu, Zhiqiang, Han, Pu, Zhu, Yunxia, Huang, Weidong
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8418931/
https://www.ncbi.nlm.nih.gov/pubmed/34493954
http://dx.doi.org/10.1155/2021/6033860
_version_ 1783748659087671296
author Wei, Jianxiang
Feng, Guanzhong
Lu, Zhiqiang
Han, Pu
Zhu, Yunxia
Huang, Weidong
author_facet Wei, Jianxiang
Feng, Guanzhong
Lu, Zhiqiang
Han, Pu
Zhu, Yunxia
Huang, Weidong
author_sort Wei, Jianxiang
collection PubMed
description Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high risk to low risk: prescription drugs (Rx), over-the-counter drugs A (OTC-A), and over-the-counter drugs B (OTC-B). Until now, there has been a lack of automated evaluation methods for switching drugs among these three statuses. Based on the China Food and Drug Administration's (CFDA) spontaneous reporting database (CSRD), we propose a classification model that predicts the risk level of drugs by using feature enhancement based on Generative Adversarial Networks (GAN) and the Synthetic Minority Over-Sampling Technique (SMOTE). A total of 985,960 spontaneous reports from 2011 to 2018 were selected from the CSRD in Jiangsu Province as experimental data. After data preprocessing, a class-imbalanced data set was obtained, which contained 887 Rx (accounting for 84.72%), 113 OTC-A (10.79%), and 47 OTC-B (4.49%) drugs. Taking drugs as the samples, ADRs as the features, and signal detection results obtained by the proportional reporting ratio (PRR) method as the feature values, we constructed the original data matrix, where the last column represents the category label of each drug. Our proposed model expands the ADR data in both the sample space and the feature space. In the feature space, we use feature selection (FS) to screen ADR symptoms with higher importance scores. Then, we use a GAN to generate artificial data, which are added to the feature space to achieve feature enhancement. In the sample space, we use SMOTE to expand the minority samples, balancing the three categories of drugs and minimizing the classification bias caused by the gap in sample size. Finally, we use the random forest (RF) algorithm to classify the feature-enhanced and balanced data set. The experimental results show that the accuracy of the proposed classification model reaches 98%. The proposed model evaluates drug risk levels well and provides an automated method for the status switching of post-marketing drugs.
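The abstract names two building blocks that are easy to illustrate in isolation: the PRR used as the feature value, and the SMOTE-style interpolation used to grow the minority classes. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names `prr` and `smote_like_sample`, the zero-cell handling, and the example counts passed to `prr` are hypothetical; the class counts 887, 113, and 47 come from the abstract. Full SMOTE picks the neighbor from the k nearest minority samples, whereas here the caller supplies it.

```python
import random


def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 contingency table.

    a: reports with the target drug and the target ADR
    b: reports with the target drug and any other ADR
    c: reports with all other drugs and the target ADR
    d: reports with all other drugs and any other ADR
    """
    # Assumption: undefined ratios are mapped to 0.0 ("no signal").
    # The abstract does not say how such cells are handled.
    if a + b == 0 or c + d == 0 or c == 0:
        return 0.0
    return (a / (a + b)) / (c / (c + d))


def smote_like_sample(x, neighbor, rng):
    """Create one synthetic point on the segment between x and neighbor.

    This is only the interpolation step of SMOTE; neighbor selection
    (k nearest minority samples) is left to the caller.
    """
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


# Class counts reported in the abstract, and their shares in percent.
counts = {"Rx": 887, "OTC-A": 113, "OTC-B": 47}
total = sum(counts.values())
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

# Hypothetical report counts, chosen only to exercise the formula.
example_prr = prr(10, 90, 20, 880)

# One synthetic minority sample between two hypothetical feature vectors.
synthetic = smote_like_sample([0.0, 2.0], [1.0, 4.0], random.Random(0))
```

The class shares computed this way reproduce the percentages quoted in the abstract, which is a quick consistency check on the reported counts.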
format Online
Article
Text
id pubmed-8418931
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8418931 2021-09-06 Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data Wei, Jianxiang Feng, Guanzhong Lu, Zhiqiang Han, Pu Zhu, Yunxia Huang, Weidong J Healthc Eng Research Article Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high risk to low risk: prescription drugs (Rx), over-the-counter drugs A (OTC-A), and over-the-counter drugs B (OTC-B). Until now, there has been a lack of automated evaluation methods for switching drugs among these three statuses. Based on the China Food and Drug Administration's (CFDA) spontaneous reporting database (CSRD), we propose a classification model that predicts the risk level of drugs by using feature enhancement based on Generative Adversarial Networks (GAN) and the Synthetic Minority Over-Sampling Technique (SMOTE). A total of 985,960 spontaneous reports from 2011 to 2018 were selected from the CSRD in Jiangsu Province as experimental data. After data preprocessing, a class-imbalanced data set was obtained, which contained 887 Rx (accounting for 84.72%), 113 OTC-A (10.79%), and 47 OTC-B (4.49%) drugs. Taking drugs as the samples, ADRs as the features, and signal detection results obtained by the proportional reporting ratio (PRR) method as the feature values, we constructed the original data matrix, where the last column represents the category label of each drug. Our proposed model expands the ADR data in both the sample space and the feature space. In the feature space, we use feature selection (FS) to screen ADR symptoms with higher importance scores. Then, we use a GAN to generate artificial data, which are added to the feature space to achieve feature enhancement. In the sample space, we use SMOTE to expand the minority samples, balancing the three categories of drugs and minimizing the classification bias caused by the gap in sample size. Finally, we use the random forest (RF) algorithm to classify the feature-enhanced and balanced data set. The experimental results show that the accuracy of the proposed classification model reaches 98%. The proposed model evaluates drug risk levels well and provides an automated method for the status switching of post-marketing drugs. Hindawi 2021-08-27 /pmc/articles/PMC8418931/ /pubmed/34493954 http://dx.doi.org/10.1155/2021/6033860 Text en Copyright © 2021 Jianxiang Wei et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Wei, Jianxiang
Feng, Guanzhong
Lu, Zhiqiang
Han, Pu
Zhu, Yunxia
Huang, Weidong
Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title_full Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title_fullStr Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title_full_unstemmed Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title_short Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
title_sort evaluating drug risk using gan and smote based on cfda's spontaneous reporting data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8418931/
https://www.ncbi.nlm.nih.gov/pubmed/34493954
http://dx.doi.org/10.1155/2021/6033860
work_keys_str_mv AT weijianxiang evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata
AT fengguanzhong evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata
AT luzhiqiang evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata
AT hanpu evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata
AT zhuyunxia evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata
AT huangweidong evaluatingdrugriskusingganandsmotebasedoncfdasspontaneousreportingdata