
Medical Specialty Classification Based on Semiadversarial Data Augmentation

Rapidly increasing adoption of electronic health record (EHR) systems has caused automated medical specialty classification to become an important research field. Medical specialty classification not only improves EHR system retrieval efficiency and helps general practitioners identify urgent patien...


Bibliographic Details
Main authors: Zhang, Huan; Zhu, Dong; Tan, Hao; Shafiq, Muhammad; Gu, Zhaoquan
Format: Online Article Text
Language: English
Published: Hindawi 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10597728/
https://www.ncbi.nlm.nih.gov/pubmed/37881209
http://dx.doi.org/10.1155/2023/4919371
_version_ 1785125406102257664
author Zhang, Huan
Zhu, Dong
Tan, Hao
Shafiq, Muhammad
Gu, Zhaoquan
author_sort Zhang, Huan
collection PubMed
description The rapidly increasing adoption of electronic health record (EHR) systems has made automated medical specialty classification an important research field. Medical specialty classification not only improves EHR system retrieval efficiency and helps general practitioners identify urgent patient issues but is also useful for studying the practice and validity of clinical referral patterns. However, currently available medical note data are imbalanced and insufficient. In addition, medical specialty classification is a multicategory problem, and it is not easy to remove sensitive information from numerous medical notes and label them. To address these problems, we propose a data augmentation method based on adversarial attacks. The semiadversarial examples generated during the dynamic process of an adversarial attack are added to the training set as augmented examples, which effectively expands the coverage of the training data over the decision space. In addition, because nouns in medical notes carry critical information, we design a classification framework that incorporates probabilistic information about nouns, with confidence recalculation after the softmax layer. We validate the proposed method on an 18-class dataset with extremely unbalanced data; comparison experiments with four benchmarks show that our method achieves the best accuracy and F1 score, improving them by an average of 14.9%.
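The augmentation idea in the description can be illustrated with a toy sketch. The Python below is not the authors' implementation: the bag-of-words scorer, the weights, the synonym table, and the function names are all hypothetical, and "semiadversarial" is assumed here to mean a perturbed text produced partway through an attack that the model still classifies correctly.

```python
import math

# Hypothetical toy model: per-class word weights for a bag-of-words scorer.
WEIGHTS = {
    "cardiology": {"heart": 1.5, "chest": 1.0, "palpitations": 0.8},
    "neurology": {"headache": 2.0, "dizziness": 0.5},
}
# Hypothetical substitution table used by the toy attack.
SYNONYMS = {
    "heart": ["cardiac"],
    "chest": ["thorax"],
    "palpitations": ["fluttering"],
}


def predict_proba(tokens):
    """Softmax over the summed word weights of each class."""
    scores = {c: sum(w.get(t, 0.0) for t in tokens) for c, w in WEIGHTS.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {c: math.exp(s) / z for c, s in scores.items()}


def semiadversarial_examples(tokens, label, max_steps=5):
    """Greedy word-substitution attack that records intermediate texts.

    Each step applies the single substitution that most lowers the
    confidence of the true label. Perturbed texts the model still labels
    correctly are kept as augmented examples; the loop stops once the
    prediction flips or no substitution lowers the confidence.
    """
    current = list(tokens)
    collected = []
    for _ in range(max_steps):
        best, best_conf = None, predict_proba(current)[label]
        for i, tok in enumerate(current):
            for sub in SYNONYMS.get(tok, []):
                candidate = current[:i] + [sub] + current[i + 1:]
                conf = predict_proba(candidate)[label]
                if conf < best_conf:
                    best, best_conf = candidate, conf
        if best is None:
            break  # no substitution weakens the prediction any further
        current = best
        probs = predict_proba(current)
        if max(probs, key=probs.get) != label:
            break  # attack succeeded: fully adversarial, not collected
        collected.append((current, label))
    return collected


note = ["heart", "chest", "palpitations", "dizziness"]
for text, y in semiadversarial_examples(note, "cardiology"):
    print(y, " ".join(text))
```

Under these assumptions, the collected pairs keep the original label and would be appended to the training set; the final, label-flipping text is deliberately left out.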
format Online
Article
Text
id pubmed-10597728
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-10597728 2023-10-25 Medical Specialty Classification Based on Semiadversarial Data Augmentation Zhang, Huan Zhu, Dong Tan, Hao Shafiq, Muhammad Gu, Zhaoquan Comput Intell Neurosci Research Article Hindawi 2023-10-17 /pmc/articles/PMC10597728/ /pubmed/37881209 http://dx.doi.org/10.1155/2023/4919371 Text en Copyright © 2023 Huan Zhang et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Medical Specialty Classification Based on Semiadversarial Data Augmentation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10597728/
https://www.ncbi.nlm.nih.gov/pubmed/37881209
http://dx.doi.org/10.1155/2023/4919371