The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets
Imbalanced class distribution in medical datasets is a challenge that hinders correct disease classification. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we proposed undersampling the healthy class instance...
Main authors: | Al-Shamaa, Zina Z. R., Kurnaz, Sefer, Duru, Adil Deniz, Peppa, Nadia, Mirnezami, Alex H., Hamady, Zaed Z. R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7657693/ https://www.ncbi.nlm.nih.gov/pubmed/33204304 http://dx.doi.org/10.1155/2020/8824625 |
_version_ | 1783608549757157376 |
---|---|
author | Al-Shamaa, Zina Z. R. Kurnaz, Sefer Duru, Adil Deniz Peppa, Nadia Mirnezami, Alex H. Hamady, Zaed Z. R. |
author_sort | Al-Shamaa, Zina Z. R. |
collection | PubMed |
description | Imbalanced class distribution in medical datasets is a challenge that hinders correct disease classification. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instances to improve disease class classification. This model is named Hellinger Distance Undersampling (HDUS). It employs the Hellinger distance to measure the resemblance between each majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. Extensive experiments were conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can outperform the other models in terms of sensitivity, F1 measure, and balanced accuracy. |
format | Online Article Text |
id | pubmed-7657693 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-7657693 2020-11-16 The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets Al-Shamaa, Zina Z. R. Kurnaz, Sefer Duru, Adil Deniz Peppa, Nadia Mirnezami, Alex H. Hamady, Zaed Z. R. Appl Bionics Biomech Research Article Imbalanced class distribution in medical datasets is a challenge that hinders correct disease classification. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instances to improve disease class classification. This model is named Hellinger Distance Undersampling (HDUS). It employs the Hellinger distance to measure the resemblance between each majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. Extensive experiments were conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can outperform the other models in terms of sensitivity, F1 measure, and balanced accuracy. Hindawi 2020-11-04 /pmc/articles/PMC7657693/ /pubmed/33204304 http://dx.doi.org/10.1155/2020/8824625 Text en Copyright © 2020 Zina Z. R. Al-Shamaa et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7657693/ https://www.ncbi.nlm.nih.gov/pubmed/33204304 http://dx.doi.org/10.1155/2020/8824625 |
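
The description above says that HDUS uses the Hellinger distance to compare each majority (healthy) instance with its neighbouring minority (disease) instances, and that the models are compared on sensitivity, F1 measure, and balanced accuracy. The record does not give the selection rule itself, so the following is only a minimal sketch under stated assumptions, not the authors' implementation: feature vectors are assumed non-negative and are normalised into discrete distributions, and the hypothetical helper `undersample` drops the majority rows that most resemble their `k` nearest minority rows. The names `to_distribution`, `undersample`, `k`, `keep`, and `evaluate` are illustrative.

```python
import math


def to_distribution(x):
    """Scale a non-negative feature vector so its entries sum to 1 (assumption)."""
    if any(v < 0 for v in x):
        raise ValueError("features must be non-negative")
    total = sum(x)
    if total <= 0:
        raise ValueError("feature vector must have a positive sum")
    return [v / total for v in x]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Ranges from 0 (identical) to 1 (no overlapping support).
    """
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def undersample(majority, minority, k=3, keep=None):
    """Hypothetical rule: keep the majority rows least similar to the minority.

    Each majority row is scored by its mean Hellinger distance to its k
    nearest minority rows; the `keep` highest-scoring rows are returned.
    """
    if keep is None:
        keep = len(minority)  # assumption: aim for a balanced class ratio
    minority_dists = [to_distribution(m) for m in minority]
    scored = []
    for row in majority:
        p = to_distribution(row)
        nearest = sorted(hellinger(p, q) for q in minority_dists)[:k]
        scored.append((sum(nearest) / len(nearest), row))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:keep]]


def evaluate(tp, fn, tn, fp):
    """Sensitivity, F1 and balanced accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {
        "sensitivity": sensitivity,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


if __name__ == "__main__":
    healthy = [[9, 1], [5, 5], [1, 9]]   # toy majority class
    disease = [[2, 8]]                   # toy minority class
    print(undersample(healthy, disease, k=1, keep=2))
    print(evaluate(tp=8, fn=2, tn=80, fp=10))
```

In this toy run the healthy row `[1, 9]` is the one closest to the disease row and is the one discarded. Whether the original method removes the closest or the farthest majority instances, and how it chooses neighbours, should be checked against the article itself.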