Provenance-and machine learning-based recommendation of parameter values in scientific workflows
Scientific Workflows (SWfs) have revolutionized how scientists in various domains of science conduct their experiments. The management of SWfs is performed by complex tools that provide support for workflow composition, monitoring, execution, capturing, and storage of the data generated during execu...
Main authors: | Silva Junior, Daniel; Pacitti, Esther; Paes, Aline; de Oliveira, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Data Mining and Machine Learning |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279147/ https://www.ncbi.nlm.nih.gov/pubmed/34307859 http://dx.doi.org/10.7717/peerj-cs.606 |
_version_ | 1783722397274210304 |
---|---|
author | Silva Junior, Daniel Pacitti, Esther Paes, Aline de Oliveira, Daniel |
author_sort | Silva Junior, Daniel |
collection | PubMed |
description | Scientific Workflows (SWfs) have revolutionized how scientists in various domains conduct their experiments. SWfs are managed by complex tools that support workflow composition, monitoring, and execution, as well as the capture and storage of the data generated during execution. In some cases, they also provide components to ease the visualization and analysis of the generated data. During the workflow’s composition phase, programs must be selected to perform the activities defined in the workflow specification. These programs often require additional parameters that adjust the program’s behavior according to the experiment’s goals. Consequently, workflows commonly have many parameters to be configured manually, in many cases more than one hundred. Choosing wrong parameter values can cause workflow executions to crash or produce undesired results. As data- and compute-intensive workflows are commonly executed in a high-performance computing environment (e.g., a cluster, a supercomputer, or a public cloud), an unsuccessful execution wastes time and resources. In this article, we present FReeP (Feature Recommender from Preferences), a parameter value recommendation method designed to suggest values for workflow parameters, taking into account past user preferences. FReeP is based on Machine Learning techniques, particularly Preference Learning. FReeP is composed of three algorithms: two recommend the value for one parameter at a time, and the third makes recommendations for n parameters at once. The experimental results obtained with provenance data from two broadly used workflows showed FReeP’s usefulness in recommending values for one parameter. Furthermore, the results indicate the potential of FReeP to recommend values for n parameters in scientific workflows. |
format | Online Article Text |
id | pubmed-8279147 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8279147 2021-07-22 Provenance-and machine learning-based recommendation of parameter values in scientific workflows Silva Junior, Daniel Pacitti, Esther Paes, Aline de Oliveira, Daniel PeerJ Comput Sci Data Mining and Machine Learning Scientific Workflows (SWfs) have revolutionized how scientists in various domains conduct their experiments. SWfs are managed by complex tools that support workflow composition, monitoring, and execution, as well as the capture and storage of the data generated during execution. In some cases, they also provide components to ease the visualization and analysis of the generated data. During the workflow’s composition phase, programs must be selected to perform the activities defined in the workflow specification. These programs often require additional parameters that adjust the program’s behavior according to the experiment’s goals. Consequently, workflows commonly have many parameters to be configured manually, in many cases more than one hundred. Choosing wrong parameter values can cause workflow executions to crash or produce undesired results. As data- and compute-intensive workflows are commonly executed in a high-performance computing environment (e.g., a cluster, a supercomputer, or a public cloud), an unsuccessful execution wastes time and resources. In this article, we present FReeP (Feature Recommender from Preferences), a parameter value recommendation method designed to suggest values for workflow parameters, taking into account past user preferences. FReeP is based on Machine Learning techniques, particularly Preference Learning. FReeP is composed of three algorithms: two recommend the value for one parameter at a time, and the third makes recommendations for n parameters at once. The experimental results obtained with provenance data from two broadly used workflows showed FReeP’s usefulness in recommending values for one parameter. Furthermore, the results indicate the potential of FReeP to recommend values for n parameters in scientific workflows. PeerJ Inc. 2021-07-05 /pmc/articles/PMC8279147/ /pubmed/34307859 http://dx.doi.org/10.7717/peerj-cs.606 Text en © 2021 Silva Junior et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Data Mining and Machine Learning Silva Junior, Daniel Pacitti, Esther Paes, Aline de Oliveira, Daniel Provenance-and machine learning-based recommendation of parameter values in scientific workflows |
title | Provenance-and machine learning-based recommendation of parameter values in scientific workflows |
title_sort | provenance-and machine learning-based recommendation of parameter values in scientific workflows |
topic | Data Mining and Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8279147/ https://www.ncbi.nlm.nih.gov/pubmed/34307859 http://dx.doi.org/10.7717/peerj-cs.606 |
work_keys_str_mv | AT silvajuniordaniel provenanceandmachinelearningbasedrecommendationofparametervaluesinscientificworkflows AT pacittiesther provenanceandmachinelearningbasedrecommendationofparametervaluesinscientificworkflows AT paesaline provenanceandmachinelearningbasedrecommendationofparametervaluesinscientificworkflows AT deoliveiradaniel provenanceandmachinelearningbasedrecommendationofparametervaluesinscientificworkflows |