The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets

Imbalanced class distribution in medical datasets is a challenging problem that hinders the correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instance...

Bibliographic Details
Main authors: Al-Shamaa, Zina Z. R., Kurnaz, Sefer, Duru, Adil Deniz, Peppa, Nadia, Mirnezami, Alex H., Hamady, Zaed Z. R.
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7657693/
https://www.ncbi.nlm.nih.gov/pubmed/33204304
http://dx.doi.org/10.1155/2020/8824625
author Al-Shamaa, Zina Z. R.
Kurnaz, Sefer
Duru, Adil Deniz
Peppa, Nadia
Mirnezami, Alex H.
Hamady, Zaed Z. R.
author_sort Al-Shamaa, Zina Z. R.
collection PubMed
description Imbalanced class distribution in medical datasets is a challenging problem that hinders the correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instances to improve disease class classification. This model is named Hellinger Distance Undersampling (HDUS). It employs the Hellinger distance to measure the resemblance between each majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. An extensive experiment was conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can perform better than the other models in terms of sensitivity, F1 measure, and balanced accuracy.
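The description above names the Hellinger distance but does not give its formula or the selection rule used by HDUS. For two discrete probability vectors p and q, the standard definition is H(p, q) = (1/sqrt(2)) * sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2), which ranges from 0 (identical) to 1 (disjoint support). The following is only a rough sketch of how such a distance could drive undersampling, not the authors' algorithm. Everything beyond the formula is an assumption made for illustration: the function names, treating each non-negative feature row as a distribution after rescaling it to sum to 1, the neighbourhood size k, and the rule of keeping the majority instances that are farthest from their nearest minority neighbours.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2.0)


def to_distribution(X):
    """Rescale non-negative rows to sum to 1 (illustrative assumption)."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero; all-zero rows stay zero
    return X / totals


def undersample_majority(X_maj, X_min, k=3, n_keep=None):
    """Return sorted indices of the majority rows to keep.

    Hypothetical rule: score each majority row by its mean Hellinger
    distance to its k closest minority rows, then keep the n_keep rows
    with the largest scores (those that resemble the minority least).
    """
    P_maj = to_distribution(X_maj)
    P_min = to_distribution(X_min)
    if n_keep is None:
        n_keep = len(P_min)  # default: balance the two classes
    n_keep = min(n_keep, len(P_maj))
    k = min(k, len(P_min))

    # Pairwise Hellinger distances, shape (n_majority, n_minority).
    diff = np.sqrt(P_maj)[:, None, :] - np.sqrt(P_min)[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2)) / np.sqrt(2.0)

    # Mean distance to the k nearest minority rows.
    score = np.sort(D, axis=1)[:, :k].mean(axis=1)

    # Largest scores first; stable sort keeps ties in input order.
    keep = np.argsort(-score, kind="stable")[:n_keep]
    return np.sort(keep)


if __name__ == "__main__":
    X_maj = [[9, 1], [5, 5], [1, 9]]  # toy "healthy" rows
    X_min = [[1, 9], [2, 8]]          # toy "disease" rows
    print(undersample_majority(X_maj, X_min, k=1, n_keep=2))
```

In this toy run the third majority row has the same proportions as a minority row, so it receives a score of 0 and is the one dropped. Whether HDUS removes or retains such overlapping instances, and how it defines the neighbourhood, should be checked against the full article.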
format Online
Article
Text
id pubmed-7657693
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7657693 2020-11-16 The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets Al-Shamaa, Zina Z. R. Kurnaz, Sefer Duru, Adil Deniz Peppa, Nadia Mirnezami, Alex H. Hamady, Zaed Z. R. Appl Bionics Biomech Research Article Imbalanced class distribution in medical datasets is a challenging problem that hinders the correct classification of disease. It arises when the number of healthy class instances is much larger than the number of disease class instances. To solve this problem, we propose undersampling the healthy class instances to improve disease class classification. This model is named Hellinger Distance Undersampling (HDUS). It employs the Hellinger distance to measure the resemblance between each majority class instance and its neighbouring minority class instances, in order to separate the classes effectively and boost the discrimination power for each class. An extensive experiment was conducted on four imbalanced medical datasets using three classifiers to compare HDUS with a baseline model and three state-of-the-art undersampling models. The results show that HDUS can perform better than the other models in terms of sensitivity, F1 measure, and balanced accuracy. Hindawi 2020-11-04 /pmc/articles/PMC7657693/ /pubmed/33204304 http://dx.doi.org/10.1155/2020/8824625 Text en Copyright © 2020 Zina Z. R. Al-Shamaa et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The Use of Hellinger Distance Undersampling Model to Improve the Classification of Disease Class in Imbalanced Medical Datasets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7657693/
https://www.ncbi.nlm.nih.gov/pubmed/33204304
http://dx.doi.org/10.1155/2020/8824625