A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data
In the k Nearest Neighbor (kNN) classifier, a query instance is classified based on the most frequent class of its nearest neighbors among the training instances. In imbalanced datasets, kNN becomes biased towards the majority instances of the training space. To solve this problem, we propose a method c...
Main Authors: | Kadir, Md. Eusha; Akash, Pritom Saha; Sharmin, Sadia; Ali, Amin Ahsan; Shoyaib, Mohammad |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206335/ http://dx.doi.org/10.1007/978-3-030-47436-2_6 |
_version_ | 1783530395059355648 |
---|---|
author | Kadir, Md. Eusha; Akash, Pritom Saha; Sharmin, Sadia; Ali, Amin Ahsan; Shoyaib, Mohammad |
author_facet | Kadir, Md. Eusha; Akash, Pritom Saha; Sharmin, Sadia; Ali, Amin Ahsan; Shoyaib, Mohammad |
author_sort | Kadir, Md. Eusha |
collection | PubMed |
description | In the k Nearest Neighbor (kNN) classifier, a query instance is classified based on the most frequent class of its nearest neighbors among the training instances. In imbalanced datasets, kNN becomes biased towards the majority instances of the training space. To solve this problem, we propose a method called the Proximity weighted Evidential kNN classifier. In this method, each neighbor of a query instance is treated as a piece of evidence, from which we calculate the probability of the class label given the feature values so as to give more preference to the minority instances. This evidence is then discounted by the proximity of the neighbor to prioritize the closer instances in the local neighborhood. The pieces of evidence are then combined using the Dempster-Shafer theory of evidence. A rigorous experiment over 30 benchmark imbalanced datasets shows that our method performs better than 12 popular methods. In pairwise comparisons of these 12 methods with our method, our method wins on 29 datasets in the best case and on at least 19 datasets in the worst case. More importantly, according to the Friedman test, the proposed method ranks higher than all other methods in terms of AUC at the 5% level of significance. |
format | Online Article Text |
id | pubmed-7206335 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7206335 2020-05-08 A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data Kadir, Md. Eusha; Akash, Pritom Saha; Sharmin, Sadia; Ali, Amin Ahsan; Shoyaib, Mohammad Advances in Knowledge Discovery and Data Mining Article In the k Nearest Neighbor (kNN) classifier, a query instance is classified based on the most frequent class of its nearest neighbors among the training instances. In imbalanced datasets, kNN becomes biased towards the majority instances of the training space. To solve this problem, we propose a method called the Proximity weighted Evidential kNN classifier. In this method, each neighbor of a query instance is treated as a piece of evidence, from which we calculate the probability of the class label given the feature values so as to give more preference to the minority instances. This evidence is then discounted by the proximity of the neighbor to prioritize the closer instances in the local neighborhood. The pieces of evidence are then combined using the Dempster-Shafer theory of evidence. A rigorous experiment over 30 benchmark imbalanced datasets shows that our method performs better than 12 popular methods. In pairwise comparisons of these 12 methods with our method, our method wins on 29 datasets in the best case and on at least 19 datasets in the worst case. More importantly, according to the Friedman test, the proposed method ranks higher than all other methods in terms of AUC at the 5% level of significance. 2020-04-17 /pmc/articles/PMC7206335/ http://dx.doi.org/10.1007/978-3-030-47436-2_6 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Kadir, Md. Eusha Akash, Pritom Saha Sharmin, Sadia Ali, Amin Ahsan Shoyaib, Mohammad A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title | A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title_full | A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title_fullStr | A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title_full_unstemmed | A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title_short | A Proximity Weighted Evidential k Nearest Neighbor Classifier for Imbalanced Data |
title_sort | proximity weighted evidential k nearest neighbor classifier for imbalanced data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206335/ http://dx.doi.org/10.1007/978-3-030-47436-2_6 |
work_keys_str_mv | AT kadirmdeusha aproximityweightedevidentialknearestneighborclassifierforimbalanceddata AT akashpritomsaha aproximityweightedevidentialknearestneighborclassifierforimbalanceddata AT sharminsadia aproximityweightedevidentialknearestneighborclassifierforimbalanceddata AT aliaminahsan aproximityweightedevidentialknearestneighborclassifierforimbalanceddata AT shoyaibmohammad aproximityweightedevidentialknearestneighborclassifierforimbalanceddata AT kadirmdeusha proximityweightedevidentialknearestneighborclassifierforimbalanceddata AT akashpritomsaha proximityweightedevidentialknearestneighborclassifierforimbalanceddata AT sharminsadia proximityweightedevidentialknearestneighborclassifierforimbalanceddata AT aliaminahsan proximityweightedevidentialknearestneighborclassifierforimbalanceddata AT shoyaibmohammad proximityweightedevidentialknearestneighborclassifierforimbalanceddata |
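
The description above outlines three steps: turn each neighbor into a piece of evidence that favors minority classes, discount it by proximity, and combine the pieces with Dempster-Shafer theory. The following Python sketch illustrates that general idea only. It is not the authors' formulation, which this record does not give. The inverse-frequency `support` term, the exponential `proximity` discount, and the parameters `k`, `gamma`, and `alpha` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

THETA = "Theta"  # the whole frame of discernment ("any class"); name is illustrative


def combine(m1, m2):
    """Dempster's rule for mass functions whose focal sets are
    single classes plus THETA."""
    combined, conflict = {}, 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            if a == THETA:
                key = b                  # Theta ∩ B = B
            elif b == THETA or a == b:
                key = a                  # A ∩ Theta = A, A ∩ A = A
            else:
                conflict += va * vb      # two different classes: empty intersection
                continue
            combined[key] = combined.get(key, 0.0) + va * vb
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {key: value / norm for key, value in combined.items()}


def classify(query, X, y, k=3, gamma=1.0, alpha=0.95):
    """Evidential kNN sketch. Every weighting choice here is an assumption."""
    counts = Counter(y)
    rarest = min(counts.values())
    # The k training points closest to the query (Euclidean distance).
    neighbors = sorted(zip(X, y), key=lambda p: math.dist(query, p[0]))[:k]
    belief = {THETA: 1.0}                # start from total ignorance
    for xi, yi in neighbors:
        # Assumed class weighting: 1.0 for the rarest class, less for frequent ones.
        support = rarest / counts[yi]
        # Assumed proximity discount: decays with squared distance.
        proximity = math.exp(-gamma * math.dist(query, xi) ** 2)
        mass = alpha * support * proximity   # alpha < 1 keeps some mass on THETA
        belief = combine(belief, {yi: mass, THETA: 1.0 - mass})
    label = max(counts, key=lambda c: belief.get(c, 0.0))
    return label, belief


# Toy data: five majority points and two minority points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (2, 2), (3, 3)]
y = ["maj"] * 5 + ["min"] * 2
label, belief = classify((2.1, 2.1), X, y, k=3)
print(label, belief)
```

Keeping `alpha` below 1 means no single neighbor is ever fully certain, so the normalization in `combine` cannot divide by zero even when neighbors of different classes lie very close to the query.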