K Important Neighbors: A Novel Approach to Binary Classification in High Dimensional Data
| Main Authors: | |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Hindawi, 2017 |
| Subjects: | |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5742505/ https://www.ncbi.nlm.nih.gov/pubmed/29376076 http://dx.doi.org/10.1155/2017/7560807 |
| Summary: | K nearest neighbors (KNN) is known as one of the simplest nonparametric classifiers, but in high dimensional settings the accuracy of KNN suffers from nuisance features. In this study, we propose K important neighbors (KIN) as a novel approach to binary classification in high dimensional problems. To avoid the curse of dimensionality, we fit smoothly clipped absolute deviation (SCAD) logistic regression at an initial stage and incorporate the importance of each feature into the dissimilarity measure by imposing feature contributions, expressed as a function of the SCAD coefficients, on the Euclidean distance. This hybrid dissimilarity measure, which combines information on both features and distances, enjoys the good properties of SCAD penalized regression and of KNN simultaneously. In comparison with KNN, simulation studies showed that KIN performs well in terms of both accuracy and dimension reduction. The proposed approach eliminates nearly all noninformative features because it exploits the oracle property of SCAD penalized regression in the construction of the dissimilarity measure. In very sparse settings, KIN also outperforms support vector machine (SVM) and random forest (RF), widely regarded as the best classifiers. |
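The two-stage recipe the summary describes (a sparse penalized fit, then nearest neighbors under an importance-weighted distance) can be sketched in a few lines. The sketch below is illustrative only: scikit-learn ships no SCAD penalty, so an L1 (lasso) logistic regression stands in for SCAD, and scaling each feature by the absolute value of its coefficient is an assumed stand-in for the paper's exact dissimilarity formula.

```python
# Minimal sketch of the KIN idea, assuming an L1 penalty as a SCAD
# stand-in and |coefficient| as the feature weight (both are assumptions,
# not the paper's exact construction).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# High dimensional, very sparse setting: few informative features.
X, y = make_classification(n_samples=300, n_features=200,
                           n_informative=5, n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stage 1: sparse penalized logistic regression (L1 as a SCAD stand-in).
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso.fit(X_train, y_train)
weights = np.abs(lasso.coef_.ravel())  # feature importances; most are zero

# Stage 2: KNN on importance-rescaled features. Scaling feature j by
# weights[j] makes differences along zero-weight (noninformative)
# features vanish from the Euclidean distance entirely.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train * weights, y_train)
print("kept features:", int((weights > 0).sum()), "of", X.shape[1])
print("test accuracy:", knn.score(X_test * weights, y_test))
```

Because the penalized fit drives most coefficients exactly to zero, the weighted distance ignores nuisance features, which is the mechanism behind the dimension-reduction behavior the summary reports.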