Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm

The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from the automatic generation of a rule-based knowledge base from source data by clustering rules in the knowledge base to optimize inference processes and to detecting un...

Bibliographic Details
Main Authors: Brzezińska, Agnieszka Nowak; Horyń, Czesław
Format: Online Article Text
Language: English
Published: The Author(s). Published by Elsevier B.V. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486228/
https://www.ncbi.nlm.nih.gov/pubmed/34630747
http://dx.doi.org/10.1016/j.procs.2021.09.073
_version_ 1784577704257912832
author Brzezińska, Agnieszka Nowak
Horyń, Czesław
author_facet Brzezińska, Agnieszka Nowak
Horyń, Czesław
author_sort Brzezińska, Agnieszka Nowak
collection PubMed
description The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from automatically generating a rule-based knowledge base from source data, through clustering the rules in the knowledge base to optimize inference processes, to detecting unusual rules so that the rule groups can be given an optimal structure. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters when there are outlier rules in the knowledge base. In the second phase, we detect outliers in the rules using the LOF (Local Outlier Factor) algorithm. Then we eliminate the unusual rules from the database and check whether the selected cluster quality measures respond positively to the elimination of outliers, which would indicate that the rules were rightly considered outliers. The performed experiments confirmed the effectiveness of the LOF algorithm and of the selected cluster quality measures in detecting atypical rules. The detection of such rules can support knowledge engineers or domain experts in knowledge mining to improve the completeness of the knowledge base, which is usually the basis of a decision support system.
format Online
Article
Text
id pubmed-8486228
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Author(s). Published by Elsevier B.V.
record_format MEDLINE/PubMed
spelling pubmed-84862282021-10-04 Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm Brzezińska, Agnieszka Nowak Horyń, Czesław Procedia Comput Sci Article The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from automatically generating a rule-based knowledge base from source data, through clustering the rules in the knowledge base to optimize inference processes, to detecting unusual rules so that the rule groups can be given an optimal structure. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters when there are outlier rules in the knowledge base. In the second phase, we detect outliers in the rules using the LOF (Local Outlier Factor) algorithm. Then we eliminate the unusual rules from the database and check whether the selected cluster quality measures respond positively to the elimination of outliers, which would indicate that the rules were rightly considered outliers. The performed experiments confirmed the effectiveness of the LOF algorithm and of the selected cluster quality measures in detecting atypical rules. The detection of such rules can support knowledge engineers or domain experts in knowledge mining to improve the completeness of the knowledge base, which is usually the basis of a decision support system. The Author(s). Published by Elsevier B.V. 2021 2021-10-01 /pmc/articles/PMC8486228/ /pubmed/34630747 http://dx.doi.org/10.1016/j.procs.2021.09.073 Text en © 2021 The Author(s) Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website.
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Brzezińska, Agnieszka Nowak
Horyń, Czesław
Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title_full Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title_fullStr Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title_full_unstemmed Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title_short Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
title_sort outliers in covid 19 data based on rule representation - the analysis of lof algorithm
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486228/
https://www.ncbi.nlm.nih.gov/pubmed/34630747
http://dx.doi.org/10.1016/j.procs.2021.09.073
work_keys_str_mv AT brzezinskaagnieszkanowak outliersincovid19databasedonrulerepresentationtheanalysisoflofalgorithm
AT horynczesław outliersincovid19databasedonrulerepresentationtheanalysisoflofalgorithm
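
The description field above mentions detecting outlier rules with the LOF (Local Outlier Factor) algorithm and then eliminating them. As a rough illustration only, the following Python sketch computes LOF scores for a handful of points and separates the flagged ones. It is not the authors' implementation: the numeric encoding of rules as 2-D points, the names `lof_scores` and `split_outliers`, the neighbourhood size `k=3`, and the cut-off `threshold=1.5` are all assumptions made for this example. The clustering step and the cluster quality measures are left out because the record does not name them.

```python
import math


def lof_scores(points, k=3):
    """Return one Local Outlier Factor per point (about 1.0 means 'as dense as its neighbours').

    Simplification (assumption): each neighbourhood is exactly the k nearest
    points, ties are broken by input order, and fewer than k+1 points coincide.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []   # indices of the k nearest points of each point
    k_distance = []   # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        nearest = order[:k]
        neighbours.append(nearest)
        k_distance.append(dist[i][nearest[-1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


def split_outliers(points, k=3, threshold=1.5):
    """Split points into (kept, flagged) using a hypothetical LOF cut-off."""
    scores = lof_scores(points, k)
    kept = [p for p, s in zip(points, scores) if s <= threshold]
    flagged = [p for p, s in zip(points, scores) if s > threshold]
    return kept, flagged


# Hypothetical data: two tight groups of "rules" plus one isolated rule.
rules = [(0, 0), (0, 1), (1, 0), (1, 1),
         (5, 5), (5, 6), (6, 5), (6, 6),
         (20, 0)]

kept, flagged = split_outliers(rules)
print(flagged)  # [(20, 0)]
```

In a setting like the one the description outlines, the points in `kept` would then be re-clustered and the chosen quality measures compared before and after the removal.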