Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data
Transductive graph-based semisupervised learning methods usually build an undirected graph utilizing both labeled and unlabeled samples as vertices. Those methods propagate label information of labeled samples to neighbors through their edges in order to get the predicted labels of unlabeled samples...
Main authors: | Li, Fengqi; Yu, Chuang; Yang, Nanhai; Xia, Feng; Li, Guangming; Kaveh-Yazdy, Fatemeh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation, 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3725769/ https://www.ncbi.nlm.nih.gov/pubmed/23935439 http://dx.doi.org/10.1155/2013/875450 |
_version_ | 1782278581349515264 |
---|---|
author | Li, Fengqi Yu, Chuang Yang, Nanhai Xia, Feng Li, Guangming Kaveh-Yazdy, Fatemeh |
author_facet | Li, Fengqi Yu, Chuang Yang, Nanhai Xia, Feng Li, Guangming Kaveh-Yazdy, Fatemeh |
author_sort | Li, Fengqi |
collection | PubMed |
description | Transductive graph-based semisupervised learning methods usually build an undirected graph utilizing both labeled and unlabeled samples as vertices. Those methods propagate label information of labeled samples to neighbors through their edges in order to get the predicted labels of unlabeled samples. Most popular semi-supervised learning approaches are sensitive to initial label distribution which happened in imbalanced labeled datasets. The class boundary will be severely skewed by the majority classes in an imbalanced classification. In this paper, we proposed a simple and effective approach to alleviate the unfavorable influence of imbalance problem by iteratively selecting a few unlabeled samples and adding them into the minority classes to form a balanced labeled dataset for the learning methods afterwards. The experiments on UCI datasets and MNIST handwritten digits dataset showed that the proposed approach outperforms other existing state-of-art methods. |
format | Online Article Text |
id | pubmed-3725769 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-37257692013-08-09 Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data Li, Fengqi Yu, Chuang Yang, Nanhai Xia, Feng Li, Guangming Kaveh-Yazdy, Fatemeh ScientificWorldJournal Research Article Transductive graph-based semisupervised learning methods usually build an undirected graph utilizing both labeled and unlabeled samples as vertices. Those methods propagate label information of labeled samples to neighbors through their edges in order to get the predicted labels of unlabeled samples. Most popular semi-supervised learning approaches are sensitive to initial label distribution which happened in imbalanced labeled datasets. The class boundary will be severely skewed by the majority classes in an imbalanced classification. In this paper, we proposed a simple and effective approach to alleviate the unfavorable influence of imbalance problem by iteratively selecting a few unlabeled samples and adding them into the minority classes to form a balanced labeled dataset for the learning methods afterwards. The experiments on UCI datasets and MNIST handwritten digits dataset showed that the proposed approach outperforms other existing state-of-art methods. Hindawi Publishing Corporation 2013-07-10 /pmc/articles/PMC3725769/ /pubmed/23935439 http://dx.doi.org/10.1155/2013/875450 Text en Copyright © 2013 Fengqi Li et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Li, Fengqi Yu, Chuang Yang, Nanhai Xia, Feng Li, Guangming Kaveh-Yazdy, Fatemeh Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title_full | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title_fullStr | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title_full_unstemmed | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title_short | Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data |
title_sort | iterative nearest neighborhood oversampling in semisupervised learning from imbalanced data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3725769/ https://www.ncbi.nlm.nih.gov/pubmed/23935439 http://dx.doi.org/10.1155/2013/875450 |
work_keys_str_mv | AT lifengqi iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata AT yuchuang iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata AT yangnanhai iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata AT xiafeng iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata AT liguangming iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata AT kavehyazdyfatemeh iterativenearestneighborhoodoversamplinginsemisupervisedlearningfromimbalanceddata |