Evolutionary Mahalanobis Distance-Based Oversampling for Multi-Class Imbalanced Data Classification

The number of sensing data are often imbalanced across data classes, for which oversampling on the minority class is an effective remedy. In this paper, an effective oversampling method called evolutionary Mahalanobis distance oversampling (EMDO) is proposed for multi-class imbalanced data classific...

Bibliographic Details
Main authors: Yao, Leehter; Lin, Tung-Bin
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8512012/
https://www.ncbi.nlm.nih.gov/pubmed/34640936
http://dx.doi.org/10.3390/s21196616
author Yao, Leehter
Lin, Tung-Bin
collection PubMed
description The number of sensing data are often imbalanced across data classes, for which oversampling on the minority class is an effective remedy. In this paper, an effective oversampling method called evolutionary Mahalanobis distance oversampling (EMDO) is proposed for multi-class imbalanced data classification. EMDO utilizes a set of ellipsoids to approximate the decision regions of the minority class. Furthermore, multi-objective particle swarm optimization (MOPSO) is integrated with the Gustafson–Kessel algorithm in EMDO to learn the size, center, and orientation of every ellipsoid. Synthetic minority samples are generated based on Mahalanobis distance within every ellipsoid. The number of synthetic minority samples generated by EMDO in every ellipsoid is determined based on the density of minority samples in every ellipsoid. The results of computer simulations conducted herein indicate that EMDO outperforms most of the widely used oversampling schemes.
format Online
Article
Text
id pubmed-8512012
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8512012 2021-10-14 Yao, Leehter; Lin, Tung-Bin. Sensors (Basel). Article. MDPI 2021-10-04 /pmc/articles/PMC8512012/ /pubmed/34640936 http://dx.doi.org/10.3390/s21196616 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Evolutionary Mahalanobis Distance-Based Oversampling for Multi-Class Imbalanced Data Classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8512012/
https://www.ncbi.nlm.nih.gov/pubmed/34640936
http://dx.doi.org/10.3390/s21196616