
Discretisation of conditions in decision rules induced for continuous data

Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. This means that conclusions and observations are based on reduced data, as discretisation usually discards some information. The paper presents a different appro...


Bibliographic Details
Main Authors: Stańczyk, Urszula; Zielosko, Beata; Baron, Grzegorz
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7176134/
https://www.ncbi.nlm.nih.gov/pubmed/32320407
http://dx.doi.org/10.1371/journal.pone.0231788
_version_ 1783524960116932608
author Stańczyk, Urszula
Zielosko, Beata
Baron, Grzegorz
author_facet Stańczyk, Urszula
Zielosko, Beata
Baron, Grzegorz
author_sort Stańczyk, Urszula
collection PubMed
description Discretisation procedures are typically implemented as part of the initial pre-processing of data, before knowledge mining is employed. This means that conclusions and observations are based on reduced data, as discretisation usually discards some information. The paper presents a different approach, in which discretisation is executed after data mining. In the described study, decision rules were first induced from real-valued features. Secondly, the data sets were discretised. In the third step, using the categories found for the attributes, the conditions included in the inferred rules were translated into the discrete domain. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The experiments show that the proposed processing leads to rule sets of significantly reduced size while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost.
format Online
Article
Text
id pubmed-7176134
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7176134 2020-05-12 Discretisation of conditions in decision rules induced for continuous data Stańczyk, Urszula Zielosko, Beata Baron, Grzegorz PLoS One Research Article Public Library of Science 2020-04-22 /pmc/articles/PMC7176134/ /pubmed/32320407 http://dx.doi.org/10.1371/journal.pone.0231788 Text en © 2020 Stańczyk et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
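
The three steps summarised in this record's description (inducing rules on real-valued features, discretising the data, then translating rule conditions into the discrete domain) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not give the translation scheme, so the sketch assumes conditions of the form `attribute >= t` or `attribute < t`, uses equal-width binning as a stand-in discretiser, and maps each condition to the set of bins that overlap it. The names `Condition`, `equal_width_cuts`, `translate`, `translate_rule`, and the attribute `comma_freq` are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One continuous rule condition (hypothetical form): attribute <op> threshold."""
    attribute: str
    op: str            # ">=" or "<"
    threshold: float


def equal_width_cuts(values, n_bins):
    """Stand-in discretiser (an assumption): n_bins - 1 equal-width cut points."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def translate(cond, cut_points):
    """Return the indices of the bins that overlap the continuous condition.

    Bin i covers [cut_points[i-1], cut_points[i]); the first and last bins
    are unbounded on the left and right respectively.
    """
    n_bins = len(cut_points) + 1
    k = bisect_right(cut_points, cond.threshold)  # bin containing the threshold
    if cond.op == ">=":
        return set(range(k, n_bins))
    if cond.op == "<":
        # If the threshold is itself a cut point, bin k starts at the
        # threshold and lies entirely outside the condition.
        upper = k if cond.threshold in cut_points else k + 1
        return set(range(upper))
    raise ValueError(f"unsupported operator: {cond.op}")


def translate_rule(conditions, cuts):
    """Translate every condition of a rule; intersect bins per attribute."""
    discrete = {}
    for c in conditions:
        bins = translate(c, cuts[c.attribute])
        discrete[c.attribute] = discrete.get(c.attribute, bins) & bins
    return {attr: frozenset(b) for attr, b in discrete.items()}


# Hypothetical data: one stylometric attribute observed in [0, 1], four bins.
cuts = {"comma_freq": equal_width_cuts([0.0, 1.0], 4)}

# Two continuous rules that differ only in their thresholds.
rule_a = [Condition("comma_freq", ">=", 0.30)]
rule_b = [Condition("comma_freq", ">=", 0.40)]

discrete_a = translate_rule(rule_a, cuts)
discrete_b = translate_rule(rule_b, cuts)
```

In this toy case both thresholds fall into the same bin, so the two continuous rules become identical after translation and could be merged. That is one plausible way in which a rule set might shrink under such processing; whether it matches the mechanism reported in the paper is not stated in this record.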