Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm
The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from the automatic generation of a rule-based knowledge base from source data by clustering rules in the knowledge base to optimize inference processes and to detecting un...
Main authors: | Brzezińska, Agnieszka Nowak; Horyń, Czesław |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
The Author(s). Published by Elsevier B.V.
2021
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486228/ https://www.ncbi.nlm.nih.gov/pubmed/34630747 http://dx.doi.org/10.1016/j.procs.2021.09.073 |
_version_ | 1784577704257912832 |
---|---|
author | Brzezińska, Agnieszka Nowak Horyń, Czesław |
author_facet | Brzezińska, Agnieszka Nowak Horyń, Czesław |
author_sort | Brzezińska, Agnieszka Nowak |
collection | PubMed |
description | The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from the automatic generation of a rule-based knowledge base from source data, through clustering the rules in the knowledge base to optimize inference processes, to detecting unusual rules, which allows for an optimal structure of rule groups. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters when there are outlier rules in the knowledge base. In the second phase, we detect outliers in the rules using the LOF (Local Outlier Factor) algorithm. Then we eliminate the unusual rules from the database and check whether the selected cluster quality measures respond positively to the elimination of outliers, which would indicate that the rules were rightly considered outliers. The performed experiments confirmed the effectiveness of the LOF algorithm and the selected cluster quality measures in the context of detecting atypical rules. The detection of such rules can support knowledge engineers or domain experts in knowledge mining to improve the completeness of the knowledge base, which is usually the basis of the decision support system. |
format | Online Article Text |
id | pubmed-8486228 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | The Author(s). Published by Elsevier B.V. |
record_format | MEDLINE/PubMed |
spelling | pubmed-84862282021-10-04 Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm Brzezińska, Agnieszka Nowak Horyń, Czesław Procedia Comput Sci Article The article concerns the detection of outliers in rule-based knowledge bases containing data on Covid 19 cases. The authors move from the automatic generation of a rule-based knowledge base from source data, through clustering the rules in the knowledge base to optimize inference processes, to detecting unusual rules, which allows for an optimal structure of rule groups. The paper presents a two-phase procedure. In the first phase, we look for the optimal structure of rule clusters when there are outlier rules in the knowledge base. In the second phase, we detect outliers in the rules using the LOF (Local Outlier Factor) algorithm. Then we eliminate the unusual rules from the database and check whether the selected cluster quality measures respond positively to the elimination of outliers, which would indicate that the rules were rightly considered outliers. The performed experiments confirmed the effectiveness of the LOF algorithm and the selected cluster quality measures in the context of detecting atypical rules. The detection of such rules can support knowledge engineers or domain experts in knowledge mining to improve the completeness of the knowledge base, which is usually the basis of the decision support system. The Author(s). Published by Elsevier B.V. 2021 2021-10-01 /pmc/articles/PMC8486228/ /pubmed/34630747 http://dx.doi.org/10.1016/j.procs.2021.09.073 Text en © 2021 The Author(s) Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. 
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Brzezińska, Agnieszka Nowak Horyń, Czesław Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title | Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title_full | Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title_fullStr | Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title_full_unstemmed | Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title_short | Outliers in Covid 19 data based on Rule representation - the analysis of LOF algorithm |
title_sort | outliers in covid 19 data based on rule representation - the analysis of lof algorithm |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486228/ https://www.ncbi.nlm.nih.gov/pubmed/34630747 http://dx.doi.org/10.1016/j.procs.2021.09.073 |
work_keys_str_mv | AT brzezinskaagnieszkanowak outliersincovid19databasedonrulerepresentationtheanalysisoflofalgorithm AT horynczesław outliersincovid19databasedonrulerepresentationtheanalysisoflofalgorithm |
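
The second phase named in the description (LOF-based detection followed by removal of the flagged rules) can be sketched roughly as below. This is a minimal plain-Python illustration, not the authors' implementation: the numeric rule vectors, the neighbourhood size `k = 3`, and the `1.5` threshold are all assumptions made for the example, since the record does not say how rules are encoded or which cluster quality measures are used.

```python
import math


def lof_scores(points, k):
    """Return the Local Outlier Factor of every point (plain-Python sketch)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Exactly k nearest neighbours per point; ties are broken by index,
    # which simplifies the original definition (it keeps all tied points).
    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        mean_reach = sum(max(k_distance[j], dist[i][j]) for j in neighbours[i]) / k
        lrd.append(math.inf if mean_reach == 0 else 1.0 / mean_reach)

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical rule vectors: two dense groups plus one isolated rule.
rules = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 0.5),
]

THRESHOLD = 1.5  # assumed cut-off; values near 1 indicate an inlier
scores = lof_scores(rules, k=3)
outlier_ids = [i for i, s in enumerate(scores) if s > THRESHOLD]
kept_rules = [r for i, r in enumerate(rules) if i not in outlier_ids]

print("LOF scores:", [round(s, 2) for s in scores])
print("flagged rule indices:", outlier_ids)
```

In the procedure the description outlines, the remaining rules (`kept_rules` here) would then be re-clustered and the chosen cluster quality measures compared before and after removal; that step is not shown because the measures are not named in this record.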