Discretisation of conditions in decision rules induced for continuous data
Typically discretisation procedures are implemented as a part of initial pre-processing of data, before knowledge mining is employed. It means that conclusions and observations are based on reduced data, as usually by discretisation some information is discarded. The paper presents a different appro...
Main Authors: | Stańczyk, Urszula; Zielosko, Beata; Baron, Grzegorz |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7176134/ https://www.ncbi.nlm.nih.gov/pubmed/32320407 http://dx.doi.org/10.1371/journal.pone.0231788 |
_version_ | 1783524960116932608 |
---|---|
author | Stańczyk, Urszula; Zielosko, Beata; Baron, Grzegorz |
author_facet | Stańczyk, Urszula; Zielosko, Beata; Baron, Grzegorz |
author_sort | Stańczyk, Urszula |
collection | PubMed |
description | Typically, discretisation procedures are implemented as part of the initial pre-processing of data, before knowledge mining is employed. This means that conclusions and observations are based on reduced data, since discretisation usually discards some information. The paper presents a different approach, in which discretisation is executed after data mining. In the described study, decision rules were first induced from real-valued features. Secondly, the data sets were discretised. In the third step, using the categories found for the attributes, the conditions included in the inferred rules were translated into the discrete domain. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost. |
format | Online Article Text |
id | pubmed-7176134 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-71761342020-05-12 Discretisation of conditions in decision rules induced for continuous data Stańczyk, Urszula Zielosko, Beata Baron, Grzegorz PLoS One Research Article Typically discretisation procedures are implemented as a part of initial pre-processing of data, before knowledge mining is employed. It means that conclusions and observations are based on reduced data, as usually by discretisation some information is discarded. The paper presents a different approach, with taking advantage of discretisation executed after data mining. In the described study firstly decision rules were induced from real-valued features. Secondly, data sets were discretised. Using categories found for attributes, in the third step conditions included in inferred rules were translated into discrete domain. The properties and performance of rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of continuous nature. The performed experiments show that the proposed processing leads to sets of rules with significantly reduced sizes while maintaining quality of predictions, and allows to test many data discretisation methods at the acceptable computational costs. Public Library of Science 2020-04-22 /pmc/articles/PMC7176134/ /pubmed/32320407 http://dx.doi.org/10.1371/journal.pone.0231788 Text en © 2020 Stańczyk et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Stańczyk, Urszula Zielosko, Beata Baron, Grzegorz Discretisation of conditions in decision rules induced for continuous data |
title | Discretisation of conditions in decision rules induced for continuous data |
title_full | Discretisation of conditions in decision rules induced for continuous data |
title_fullStr | Discretisation of conditions in decision rules induced for continuous data |
title_full_unstemmed | Discretisation of conditions in decision rules induced for continuous data |
title_short | Discretisation of conditions in decision rules induced for continuous data |
title_sort | discretisation of conditions in decision rules induced for continuous data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7176134/ https://www.ncbi.nlm.nih.gov/pubmed/32320407 http://dx.doi.org/10.1371/journal.pone.0231788 |
work_keys_str_mv | AT stanczykurszula discretisationofconditionsindecisionrulesinducedforcontinuousdata AT zieloskobeata discretisationofconditionsindecisionrulesinducedforcontinuousdata AT barongrzegorz discretisationofconditionsindecisionrulesinducedforcontinuousdata |
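
The three steps named in the description (induce rules on real-valued features, discretise the data, translate rule conditions into the discrete domain) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the rule representation, the attribute name `freq_and`, the class label `author_A`, the choice of equal-width binning, and the convention of replacing a threshold by the index of the bin that contains it.

```python
from bisect import bisect_right


def equal_width_cuts(values, n_bins):
    """Interior cut points of an equal-width discretisation (assumed method)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def to_bin(value, cuts):
    """Index of the interval containing value (a value equal to a cut goes up)."""
    return bisect_right(cuts, value)


def translate_rule(rule, cuts_by_attr):
    """Replace each real-valued threshold by the index of its bin."""
    conditions, decision = rule
    translated = tuple(
        (attr, op, to_bin(threshold, cuts_by_attr[attr]))
        for attr, op, threshold in conditions
    )
    return translated, decision


# Step 1 (assumed already done): hypothetical rules induced on continuous data.
rules = [
    ((("freq_and", ">", 2.3),), "author_A"),
    ((("freq_and", ">", 2.7),), "author_A"),
]

# Step 2: discretise the data; here, four equal-width bins per attribute.
samples = {"freq_and": [0.0, 1.0, 2.0, 3.0, 4.0]}
cuts_by_attr = {attr: equal_width_cuts(vals, 4) for attr, vals in samples.items()}

# Step 3: translate the conditions; a set drops rules that become identical.
discrete_rules = {translate_rule(rule, cuts_by_attr) for rule in rules}

print(cuts_by_attr)
print(discrete_rules)
```

In this toy case both thresholds fall into the same interval, so the two continuous rules coincide after translation. This is only one way in which a translated rule set could end up smaller than the original; the article itself should be consulted for the authors' actual procedure.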