
Semi-supervised incremental learning with few examples for discovering medical association rules

BACKGROUND: Association Rules are one of the main ways to represent structural patterns underlying raw data. They represent dependencies between sets of observations contained in the data. The associations established by these rules are very useful in the medical domain, for example in the predictiv...

Bibliographic Details
Main Authors:	Sánchez-de-Madariaga, Ricardo, Martinez-Romo, Juan, Escribano, José Miguel Cantero, Araujo, Lourdes
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8785547/ https://www.ncbi.nlm.nih.gov/pubmed/35073885 http://dx.doi.org/10.1186/s12911-022-01755-3

_version_	1784638985603121152
author	Sánchez-de-Madariaga, Ricardo Martinez-Romo, Juan Escribano, José Miguel Cantero Araujo, Lourdes
author_facet	Sánchez-de-Madariaga, Ricardo Martinez-Romo, Juan Escribano, José Miguel Cantero Araujo, Lourdes
author_sort	Sánchez-de-Madariaga, Ricardo
collection	PubMed
description	BACKGROUND: Association rules are one of the main ways to represent structural patterns underlying raw data. They represent dependencies between sets of observations contained in the data. The associations established by these rules are very useful in the medical domain, for example in the predictive health field. Classic algorithms for association rule mining give rise to huge numbers of possible rules that should be filtered in order to select those most likely to be true. Most of the proposed techniques for these tasks are unsupervised. However, the accuracy provided by unsupervised systems is limited. Conversely, resorting to annotated data for training supervised systems is expensive and time-consuming. The purpose of this research is to design a new semi-supervised algorithm that performs like supervised algorithms but uses an affordable amount of training data. METHODS: In this work we propose a new semi-supervised data mining model that combines unsupervised techniques (Fisher’s exact test) with limited supervision. Starting with a small seed of annotated data, the model improves on the results (F-measure) obtained using a fully supervised system (standard supervised ML algorithms). The idea is based on utilising the agreement between the predictions of the supervised system and those of the unsupervised techniques in a series of iterative steps. RESULTS: The new semi-supervised ML algorithm improves on the results of supervised algorithms, measured using the F-measure, in the task of mining medical association rules, while training with an affordable amount of manually annotated data. CONCLUSIONS: Using a small amount of annotated data (which is easily achievable) leads to results similar to those of a supervised system. The proposal may be an important step for the practical development of techniques for mining association rules and generating new valuable scientific medical knowledge. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-022-01755-3.
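The agreement idea summarised in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the one-sided form of Fisher's exact test, the 0.05 threshold, the two rule features (confidence and support), the toy nearest-neighbour classifier, and all function names are assumptions made only for illustration. Each candidate rule is given as a 2x2 contingency table (a, b, c, d), and an unlabelled rule is moved into the training set only when the classifier and the Fisher test agree on its label.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a = records with antecedent and consequent, b = antecedent only,
    c = consequent only, d = neither. Returns P(X >= a) under independence.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def features(table):
    """(confidence, support) of a rule; an assumed, deliberately tiny feature set."""
    a, b, c, d = table
    return (a / (a + b), a / (a + b + c + d))


def predict_1nn(labeled, table):
    """Label of the nearest labelled rule: a toy stand-in for a supervised model."""
    x = features(table)

    def squared_distance(item):
        return sum((p - q) ** 2 for p, q in zip(features(item[0]), x))

    return min(labeled, key=squared_distance)[1]


def agreement_self_training(seed, unlabeled, alpha=0.05, max_rounds=10):
    """Grow the labelled set with rules on which both systems agree."""
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(max_rounds):
        agreed, remaining = [], []
        for table in pool:
            model_vote = predict_1nn(labeled, table)
            fisher_vote = fisher_right_tail(*table) < alpha
            if model_vote == fisher_vote:
                agreed.append((table, model_vote))
            else:
                remaining.append(table)
        if not agreed:  # nothing new to learn from: stop iterating
            break
        labeled.extend(agreed)
        pool = remaining
    return labeled, pool


if __name__ == "__main__":
    seed = [((40, 10, 10, 40), True), ((25, 25, 25, 25), False)]
    unlabeled = [(45, 5, 5, 45), (24, 26, 26, 24), (8, 2, 70, 20)]
    labeled, pool = agreement_self_training(seed, unlabeled)
    print(len(labeled), "labelled rules;", len(pool), "left undecided")
```

In this toy data the third rule has high confidence but a very common consequent, so the classifier and the Fisher test disagree and the rule stays in the undecided pool. A real system would replace the nearest-neighbour stand-in with a standard supervised learner and evaluate with the F-measure, as the abstract describes.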
format	Online Article Text
id	pubmed-8785547
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-87855472022-01-24 Semi-supervised incremental learning with few examples for discovering medical association rules Sánchez-de-Madariaga, Ricardo Martinez-Romo, Juan Escribano, José Miguel Cantero Araujo, Lourdes BMC Med Inform Decis Mak Research Article BACKGROUND: Association Rules are one of the main ways to represent structural patterns underlying raw data. They represent dependencies between sets of observations contained in the data. The associations established by these rules are very useful in the medical domain, for example in the predictive health field. Classic algorithms for association rule mining give rise to huge amounts of possible rules that should be filtered in order to select those most likely to be true. Most of the proposed techniques for these tasks are unsupervised. However, the accuracy provided by unsupervised systems is limited. Conversely, resorting to annotated data for training supervised systems is expensive and time-consuming. The purpose of this research is to design a new semi-supervised algorithm that performs like supervised algorithms but uses an affordable amount of training data. METHODS: In this work we propose a new semi-supervised data mining model that combines unsupervised techniques (Fisher’s exact test) with limited supervision. Starting with a small seed of annotated data, the model improves results (F-measure) obtained, using a fully supervised system (standard supervised ML algorithms). The idea is based on utilising the agreement between the predictions of the supervised system and those of the unsupervised techniques in a series of iterative steps. RESULTS: The new semi-supervised ML algorithm improves the results of supervised algorithms computed using the F-measure in the task of mining medical association rules, but training with an affordable amount of manually annotated data. 
CONCLUSIONS: Using a small amount of annotated data (which is easily achievable) leads to results similar to those of a supervised system. The proposal may be an important step for the practical development of techniques for mining association rules and generating new valuable scientific medical knowledge. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-022-01755-3. BioMed Central 2022-01-24 /pmc/articles/PMC8785547/ /pubmed/35073885 http://dx.doi.org/10.1186/s12911-022-01755-3 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Article Sánchez-de-Madariaga, Ricardo Martinez-Romo, Juan Escribano, José Miguel Cantero Araujo, Lourdes Semi-supervised incremental learning with few examples for discovering medical association rules
title	Semi-supervised incremental learning with few examples for discovering medical association rules
title_full	Semi-supervised incremental learning with few examples for discovering medical association rules
title_fullStr	Semi-supervised incremental learning with few examples for discovering medical association rules
title_full_unstemmed	Semi-supervised incremental learning with few examples for discovering medical association rules
title_short	Semi-supervised incremental learning with few examples for discovering medical association rules
title_sort	semi-supervised incremental learning with few examples for discovering medical association rules
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8785547/ https://www.ncbi.nlm.nih.gov/pubmed/35073885 http://dx.doi.org/10.1186/s12911-022-01755-3
work_keys_str_mv	AT sanchezdemadariagaricardo semisupervisedincrementallearningwithfewexamplesfordiscoveringmedicalassociationrules AT martinezromojuan semisupervisedincrementallearningwithfewexamplesfordiscoveringmedicalassociationrules AT escribanojosemiguelcantero semisupervisedincrementallearningwithfewexamplesfordiscoveringmedicalassociationrules AT araujolourdes semisupervisedincrementallearningwithfewexamplesfordiscoveringmedicalassociationrules
