SCARF: a biomedical association rule finding webserver

The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are typical examples of such datasets. These sets make possible the search for multi-parametric relations since from the plent...

Bibliographic Details
Main authors:	Szalkai, Balázs, Grolmusz, Vince
Format:	Online Article Text
Language:	English
Published:	De Gruyter 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9135138/ https://www.ncbi.nlm.nih.gov/pubmed/35119233 http://dx.doi.org/10.1515/jib-2021-0035

_version_	1784713898834788352
author	Szalkai, Balázs Grolmusz, Vince
author_facet	Szalkai, Balázs Grolmusz, Vince
author_sort	Szalkai, Balázs
collection	PubMed
description	The analysis of enormous datasets with missing data entries is a standard task in biological and medical data processing. Large-scale, multi-institution clinical studies are typical examples of such datasets. These datasets make it possible to search for multi-parametric relations, since with so much data one is likely to find a sufficient number of subjects with the required parameter combinations. In particular, finding combinatorial biomarkers for a given condition also requires a very large dataset. For fast, automatic discovery of multi-parametric relations, association-rule finding tools have been used in the data-mining community for more than two decades. Here we present the SCARF webserver for generalized association rule mining. Association rules are of the form a AND b AND … AND x → y, meaning that the presence of properties a AND b AND … AND x implies property y. Our algorithm finds generalized association rules: it also finds logical disjunctions (i.e., ORs) on the left-hand side, allowing more complex rules in the database to be discovered in a more compressed form. This feature also helps reduce the typically very large result tables of such studies, since a single rule with ORs on its left-hand side can subsume dozens of classical rules. The capabilities of the SCARF algorithm were demonstrated by mining the Alzheimer’s database of the Coalition Against Major Diseases (CAMD) in our recent publication (Archives of Gerontology and Geriatrics Vol. 73, pp. 300–307, 2017). Here we describe the webserver implementation of the algorithm.
format	Online Article Text
id	pubmed-9135138
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	De Gruyter
record_format	MEDLINE/PubMed
spelling	pubmed-9135138 2022-06-04 SCARF: a biomedical association rule finding webserver Szalkai, Balázs Grolmusz, Vince J Integr Bioinform Article De Gruyter 2022-02-04 /pmc/articles/PMC9135138/ /pubmed/35119233 http://dx.doi.org/10.1515/jib-2021-0035 Text en © 2022 Balázs Szalkai and Vince Grolmusz published by De Gruyter, Berlin/Boston https://creativecommons.org/licenses/by/4.0/ This work is licensed under the Creative Commons Attribution 4.0 International License.
spellingShingle	Article Szalkai, Balázs Grolmusz, Vince SCARF: a biomedical association rule finding webserver
title	SCARF: a biomedical association rule finding webserver
title_sort	scarf: a biomedical association rule finding webserver
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9135138/ https://www.ncbi.nlm.nih.gov/pubmed/35119233 http://dx.doi.org/10.1515/jib-2021-0035
work_keys_str_mv	AT szalkaibalazs scarfabiomedicalassociationrulefindingwebserver AT grolmuszvince scarfabiomedicalassociationrulefindingwebserver
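
The description above characterizes generalized association rules as rules whose left-hand side may contain ORs as well as ANDs, evaluated on data with missing entries. The following is a minimal illustrative sketch of how one such rule could be checked against a small table, not the SCARF algorithm itself: the record does not describe SCARF's search procedure or scoring. The representation of the left-hand side as an AND of OR-groups, the use of None for a missing entry, the choice to skip subjects whose outcome cannot be decided, and all names (holds_any, lhs_holds, rule_stats, the sample records) are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One subject: property name -> True / False / None (None marks a missing entry).
Record = Dict[str, Optional[bool]]


def holds_any(record: Record, props: Sequence[str]) -> Optional[bool]:
    """Evaluate (p1 OR p2 OR ...) with missing values.

    True if some property is known to be True, None if that cannot be
    decided because of a missing entry, False if all are known to be False.
    """
    values = [record.get(p) for p in props]
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def lhs_holds(record: Record, clauses: Sequence[Sequence[str]]) -> Optional[bool]:
    """Evaluate an AND of OR-groups, e.g. (a OR b) AND c."""
    results = [holds_any(record, clause) for clause in clauses]
    if any(r is False for r in results):
        return False
    if any(r is None for r in results):
        return None
    return True


def rule_stats(
    records: List[Record], clauses: Sequence[Sequence[str]], target: str
) -> Tuple[int, int, float]:
    """Return (covered, correct, confidence) for the rule clauses -> target.

    Assumption: subjects whose left-hand side or target is undecidable
    because of missing data are skipped rather than counted either way.
    """
    covered = 0
    correct = 0
    for record in records:
        if lhs_holds(record, clauses) is True and record.get(target) is not None:
            covered += 1
            if record[target]:
                correct += 1
    confidence = correct / covered if covered else 0.0
    return covered, correct, confidence


# Hypothetical sample data; property names a, b, c, y are placeholders.
records: List[Record] = [
    {"a": True, "b": None, "c": True, "y": True},     # covered, rule correct
    {"a": False, "b": True, "c": True, "y": True},    # covered, rule correct
    {"a": False, "b": None, "c": True, "y": True},    # (a OR b) undecidable: skipped
    {"a": True, "b": False, "c": True, "y": False},   # covered, rule wrong
    {"a": True, "b": True, "c": True, "y": None},     # target missing: skipped
    {"a": False, "b": False, "c": True, "y": True},   # left-hand side false: skipped
]

# The generalized rule (a OR b) AND c -> y.
clauses = [("a", "b"), ("c",)]
covered, correct, confidence = rule_stats(records, clauses, "y")
print(covered, correct, round(confidence, 3))
```

In this sketch the single rule (a OR b) AND c -> y stands in for the two classical rules a AND c -> y and b AND c -> y, which is the kind of compression the description attributes to allowing ORs on the left-hand side.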
