Two novel outlier detection approaches based on unsupervised possibilistic and fuzzy clustering

Outliers are data points that significantly deviate from other data points in a data set because of different mechanisms or unusual processes. Outlier detection is one of the intensively studied research topics for identification of novelties, frauds, anomalies, deviations or exceptions in addition...

Bibliographic Details
Main authors: Cebeci, Zeynel; Cebeci, Cagatay; Tahtali, Yalcin; Bayyurt, Lutfi
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575855/
https://www.ncbi.nlm.nih.gov/pubmed/36262121
http://dx.doi.org/10.7717/peerj-cs.1060
author Cebeci, Zeynel
Cebeci, Cagatay
Tahtali, Yalcin
Bayyurt, Lutfi
collection PubMed
description Outliers are data points that significantly deviate from other data points in a data set because of different mechanisms or unusual processes. Outlier detection is one of the intensively studied research topics for identification of novelties, frauds, anomalies, deviations or exceptions in addition to its use for data cleansing in data science. In this study, we propose two novel outlier detection approaches using the typicality degrees which are the partitioning result of unsupervised possibilistic clustering algorithms. The proposed approaches are based on finding the atypical data points below a predefined threshold value, a possibilistic level for evaluating a point as an outlier. The experiments on the synthetic and real data sets showed that the proposed approaches can be successfully used to detect outliers without considering the structure and distribution of the features in multidimensional data sets.
format Online
Article
Text
id pubmed-9575855
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9575855 2022-10-18 Two novel outlier detection approaches based on unsupervised possibilistic and fuzzy clustering Cebeci, Zeynel Cebeci, Cagatay Tahtali, Yalcin Bayyurt, Lutfi PeerJ Comput Sci Bioinformatics Outliers are data points that significantly deviate from other data points in a data set because of different mechanisms or unusual processes. Outlier detection is one of the intensively studied research topics for identification of novelties, frauds, anomalies, deviations or exceptions in addition to its use for data cleansing in data science. In this study, we propose two novel outlier detection approaches using the typicality degrees which are the partitioning result of unsupervised possibilistic clustering algorithms. The proposed approaches are based on finding the atypical data points below a predefined threshold value, a possibilistic level for evaluating a point as an outlier. The experiments on the synthetic and real data sets showed that the proposed approaches can be successfully used to detect outliers without considering the structure and distribution of the features in multidimensional data sets. PeerJ Inc. 2022-09-27 /pmc/articles/PMC9575855/ /pubmed/36262121 http://dx.doi.org/10.7717/peerj-cs.1060 Text en ©2022 Cebeci et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Two novel outlier detection approaches based on unsupervised possibilistic and fuzzy clustering
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575855/
https://www.ncbi.nlm.nih.gov/pubmed/36262121
http://dx.doi.org/10.7717/peerj-cs.1060
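
The abstract describes flagging a point as an outlier when its typicality degree from a possibilistic clustering falls below a predefined threshold. The record does not give the exact formulas or the two procedures the authors propose, so the following is only a minimal sketch under stated assumptions: it uses the classical possibilistic c-means typicality form, takes the cluster centers and scale parameters (`etas`) as already fitted, and treats a point as atypical when its highest typicality over all clusters is below the threshold. The function names, the decision rule, and the sample data are hypothetical and are not taken from the article.

```python
from math import dist


def typicality(point, center, eta, m=2.0):
    """Typicality of `point` to one cluster, in the classical PCM form
    1 / (1 + (d^2 / eta)^(1 / (m - 1))). Assumed here, not quoted from the article."""
    d2 = dist(point, center) ** 2
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def flag_outliers(points, centers, etas, threshold=0.1, m=2.0):
    """Return one boolean per point: True when the point's best typicality
    across all clusters is below `threshold` (an illustrative decision rule)."""
    flags = []
    for p in points:
        best = max(typicality(p, c, e, m) for c, e in zip(centers, etas))
        flags.append(best < threshold)
    return flags


# Hypothetical, already-fitted cluster parameters and three sample points.
centers = [(0.0, 0.0), (10.0, 10.0)]
etas = [1.0, 1.0]
points = [(0.5, 0.0), (10.0, 9.5), (5.0, 5.0)]

print(flag_outliers(points, centers, etas, threshold=0.1))
# → [False, False, True]
```

The first two points lie close to a center (typicality 0.8 each), while the third sits far from both (typicality about 0.02), so only the third falls below the 0.1 threshold. In practice the threshold and the scale parameters would need to be chosen for the data at hand.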