Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data
Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high-risk to low-risk drugs, namely,...
Main authors: | Wei, Jianxiang; Feng, Guanzhong; Lu, Zhiqiang; Han, Pu; Zhu, Yunxia; Huang, Weidong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8418931/ https://www.ncbi.nlm.nih.gov/pubmed/34493954 http://dx.doi.org/10.1155/2021/6033860 |
_version_ | 1783748659087671296 |
---|---|
author | Wei, Jianxiang Feng, Guanzhong Lu, Zhiqiang Han, Pu Zhu, Yunxia Huang, Weidong |
author_facet | Wei, Jianxiang Feng, Guanzhong Lu, Zhiqiang Han, Pu Zhu, Yunxia Huang, Weidong |
author_sort | Wei, Jianxiang |
collection | PubMed |
description | Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high-risk to low-risk, namely, prescription drugs (Rx), over-the-counter drugs A (OTC-A), and over-the-counter drugs B (OTC-B). Until now, there has been a lack of automated evaluation methods for switching drugs among these three statuses. Based on the China Food and Drug Administration's (CFDA) spontaneous reporting database (CSRD), we proposed a classification model to predict the risk level of drugs by using feature enhancement based on Generative Adversarial Networks (GAN) and the Synthetic Minority Over-Sampling Technique (SMOTE). A total of 985,960 spontaneous reports from 2011 to 2018 were selected from the CSRD in Jiangsu Province as experimental data. After data preprocessing, a class-imbalanced data set was obtained, which contained 887 Rx (accounting for 84.72%), 113 OTC-A (10.79%), and 47 OTC-B (4.49%). Taking drugs as the samples, ADRs as the features, and signal detection results obtained by the proportional reporting ratio (PRR) method as the feature values, we constructed the original data matrix, where the last column represents the category label of each drug. Our proposed model expands the ADR data in both the sample space and the feature space. In the feature space, we use feature selection (FS) to screen ADR symptoms with higher importance scores. Then, we use GAN to generate artificial data, which are added to the feature space to achieve feature enhancement. In the sample space, we use SMOTE to expand the minority samples to balance the three categories of drugs and minimize the classification deviation caused by the gap in sample size. Finally, we use the random forest (RF) algorithm to classify the feature-enhanced and balanced data set. The experimental results show that the accuracy of the proposed classification model reaches 98%. Our proposed model can effectively evaluate drug risk levels and provide an automated method for the status switch of post-marketing drugs. |
format | Online Article Text |
id | pubmed-8418931 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8418931 2021-09-06 Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data Wei, Jianxiang Feng, Guanzhong Lu, Zhiqiang Han, Pu Zhu, Yunxia Huang, Weidong J Healthc Eng Research Article Adverse drug reactions (ADRs) pose health threats to humans. Therefore, the risk re-evaluation of post-marketing drugs has become an important part of the pharmacovigilance work of various countries. In China, drugs are mainly divided into three categories, from high-risk to low-risk, namely, prescription drugs (Rx), over-the-counter drugs A (OTC-A), and over-the-counter drugs B (OTC-B). Until now, there has been a lack of automated evaluation methods for switching drugs among these three statuses. Based on the China Food and Drug Administration's (CFDA) spontaneous reporting database (CSRD), we proposed a classification model to predict the risk level of drugs by using feature enhancement based on Generative Adversarial Networks (GAN) and the Synthetic Minority Over-Sampling Technique (SMOTE). A total of 985,960 spontaneous reports from 2011 to 2018 were selected from the CSRD in Jiangsu Province as experimental data. After data preprocessing, a class-imbalanced data set was obtained, which contained 887 Rx (accounting for 84.72%), 113 OTC-A (10.79%), and 47 OTC-B (4.49%). Taking drugs as the samples, ADRs as the features, and signal detection results obtained by the proportional reporting ratio (PRR) method as the feature values, we constructed the original data matrix, where the last column represents the category label of each drug. Our proposed model expands the ADR data in both the sample space and the feature space. In the feature space, we use feature selection (FS) to screen ADR symptoms with higher importance scores. Then, we use GAN to generate artificial data, which are added to the feature space to achieve feature enhancement. In the sample space, we use SMOTE to expand the minority samples to balance the three categories of drugs and minimize the classification deviation caused by the gap in sample size. Finally, we use the random forest (RF) algorithm to classify the feature-enhanced and balanced data set. The experimental results show that the accuracy of the proposed classification model reaches 98%. Our proposed model can effectively evaluate drug risk levels and provide an automated method for the status switch of post-marketing drugs. Hindawi 2021-08-27 /pmc/articles/PMC8418931/ /pubmed/34493954 http://dx.doi.org/10.1155/2021/6033860 Text en Copyright © 2021 Jianxiang Wei et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Wei, Jianxiang Feng, Guanzhong Lu, Zhiqiang Han, Pu Zhu, Yunxia Huang, Weidong Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data |
title | Evaluating Drug Risk Using GAN and SMOTE Based on CFDA's Spontaneous Reporting Data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8418931/ https://www.ncbi.nlm.nih.gov/pubmed/34493954 http://dx.doi.org/10.1155/2021/6033860 |
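
The description above names two steps that are easy to illustrate in isolation: PRR values as the feature values, and SMOTE-style oversampling to balance the three drug classes. The following is only a minimal sketch under stated assumptions, not the authors' implementation. The function names `prr` and `smote_like_oversample`, the 2x2 table counts, and the toy minority points are hypothetical; only the class counts (887, 113, 47) come from the record. The PRR uses the standard 2x2 contingency-table definition, and the oversampler shows only the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from a 2x2 contingency table.

    a: reports with the drug of interest and the ADR of interest
    b: reports with the drug of interest and any other ADR
    c: reports with any other drug and the ADR of interest
    d: reports with any other drug and any other ADR
    """
    return (a / (a + b)) / (c / (c + d))


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (core SMOTE idea).
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for m in minority if m is not x]
        others.sort(key=lambda m: sum((p - q) ** 2 for p, q in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([p + gap * (q - p) for p, q in zip(x, neighbour)])
    return synthetic


# Class counts reported in the description.
counts = {"Rx": 887, "OTC-A": 113, "OTC-B": 47}
total = sum(counts.values())
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

# Synthetic samples each class would need to match the majority class.
needed = {name: max(counts.values()) - n for name, n in counts.items()}

# Hypothetical toy data: three minority samples with two PRR-like features.
minority = [[0.0, 1.0], [1.0, 2.0], [2.0, 0.5]]
new_points = smote_like_oversample(minority, n_new=4)

print(shares)
print(needed)
print(prr(10, 90, 20, 880))
```

In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would normally be preferred over a hand-written interpolation; the sketch only makes the arithmetic behind the percentages and the balancing step visible.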