
Provenance-and machine learning-based recommendation of parameter values in scientific workflows


Bibliographic Details
Main authors: Silva Junior, Daniel; Pacitti, Esther; Paes, Aline; de Oliveira, Daniel
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Data Mining and Machine Learning
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279147/
https://www.ncbi.nlm.nih.gov/pubmed/34307859
http://dx.doi.org/10.7717/peerj-cs.606
collection PubMed
description Scientific Workflows (SWfs) have revolutionized how scientists in various domains conduct their experiments. SWfs are managed by complex tools that support workflow composition, monitoring, and execution, as well as the capture and storage of the data generated during execution. In some cases, these tools also provide components that ease the visualization and analysis of the generated data. During the workflow's composition phase, programs must be selected to perform the activities defined in the workflow specification. These programs often require additional parameters that adjust the program's behavior according to the experiment's goals. Consequently, workflows commonly have many parameters to be configured manually, in many cases more than one hundred. Choosing wrong parameter values can cause workflow executions to crash or produce undesired results. As data- and compute-intensive workflows are commonly executed in a high-performance computing environment (e.g., a cluster, a supercomputer, or a public cloud), an unsuccessful execution represents a waste of time and resources. In this article, we present FReeP (Feature Recommender from Preferences), a parameter value recommendation method designed to suggest values for workflow parameters, taking into account past user preferences. FReeP is based on Machine Learning techniques, particularly Preference Learning. FReeP is composed of three algorithms: two of them recommend the value for one parameter at a time, and the third makes recommendations for n parameters at once. The experimental results obtained with provenance data from two broadly used workflows showed FReeP's usefulness in recommending values for one parameter. Furthermore, the results indicate the potential of FReeP to recommend values for n parameters in scientific workflows.
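The idea of recommending one parameter value from past executions can be illustrated with a small sketch. This is not FReeP itself and not one of its three algorithms: it swaps Preference Learning for a plain majority vote over the provenance records that match the user's stated preferences. The record layout, the parameter names (`aligner`, `model`, `bootstrap`), and the `recommend` function are all hypothetical, introduced only for illustration.

```python
from collections import Counter

# Hypothetical provenance records: each dict is one past workflow execution.
# Parameter names and values are invented for illustration only.
PROVENANCE = [
    {"aligner": "mafft", "model": "WAG", "bootstrap": 100},
    {"aligner": "mafft", "model": "WAG", "bootstrap": 100},
    {"aligner": "mafft", "model": "JTT", "bootstrap": 500},
    {"aligner": "muscle", "model": "JTT", "bootstrap": 500},
]


def recommend(records, preferences, target):
    """Return the most frequent value of `target` among past executions
    that match every (parameter, value) pair in `preferences`.

    Returns None when no past execution matches the preferences.
    This is a majority-vote stand-in, not a preference-learning model.
    """
    matching = [
        r for r in records
        if all(r.get(name) == value for name, value in preferences.items())
    ]
    votes = Counter(r[target] for r in matching if target in r)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # The user has already fixed one parameter and asks for another.
    print(recommend(PROVENANCE, {"aligner": "mafft"}, "model"))
```

Under these assumptions, a user who has fixed `aligner` to `mafft` would be offered the `model` value used most often in matching past runs. A method that learns preferences, as the abstract describes, could generalize beyond exact matches, which this sketch cannot do.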
id pubmed-8279147
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8279147 2021-07-22
Provenance-and machine learning-based recommendation of parameter values in scientific workflows
Silva Junior, Daniel; Pacitti, Esther; Paes, Aline; de Oliveira, Daniel
PeerJ Comput Sci
Data Mining and Machine Learning
PeerJ Inc. 2021-07-05
/pmc/articles/PMC8279147/ /pubmed/34307859 http://dx.doi.org/10.7717/peerj-cs.606
Text en
© 2021 Silva Junior et al. https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
topic Data Mining and Machine Learning