
Automatic configuration of the Cassandra database using irace

Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to...

Bibliographic Details
Main authors: Silva-Muñoz, Moisés; Franzin, Alberto; Bersini, Hugues
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356662/
https://www.ncbi.nlm.nih.gov/pubmed/34435094
http://dx.doi.org/10.7717/peerj-cs.634
_version_ 1783736989225320448
author Silva-Muñoz, Moisés
Franzin, Alberto
Bersini, Hugues
author_facet Silva-Muñoz, Moisés
Franzin, Alberto
Bersini, Hugues
author_sort Silva-Muñoz, Moisés
collection PubMed
description Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in recent years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require large amounts of data and fail to represent all the factors that impact the performance of a database, or implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup and obtain speedups of up to 30% over the default configuration in terms of throughput, and we provide an analysis of the configurations obtained.
format Online
Article
Text
id pubmed-8356662
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8356662 2021-08-24 Automatic configuration of the Cassandra database using irace Silva-Muñoz, Moisés Franzin, Alberto Bersini, Hugues PeerJ Comput Sci Artificial Intelligence Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in recent years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require large amounts of data and fail to represent all the factors that impact the performance of a database, or implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup and obtain speedups of up to 30% over the default configuration in terms of throughput, and we provide an analysis of the configurations obtained. PeerJ Inc. 2021-08-05 /pmc/articles/PMC8356662/ /pubmed/34435094 http://dx.doi.org/10.7717/peerj-cs.634 Text en © 2021 Silva-Muñoz et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
spellingShingle Artificial Intelligence
Silva-Muñoz, Moisés
Franzin, Alberto
Bersini, Hugues
Automatic configuration of the Cassandra database using irace
title Automatic configuration of the Cassandra database using irace
title_full Automatic configuration of the Cassandra database using irace
title_fullStr Automatic configuration of the Cassandra database using irace
title_full_unstemmed Automatic configuration of the Cassandra database using irace
title_short Automatic configuration of the Cassandra database using irace
title_sort automatic configuration of the cassandra database using irace
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356662/
https://www.ncbi.nlm.nih.gov/pubmed/34435094
http://dx.doi.org/10.7717/peerj-cs.634
work_keys_str_mv AT silvamunozmoises automaticconfigurationofthecassandradatabaseusingirace
AT franzinalberto automaticconfigurationofthecassandradatabaseusingirace
AT bersinihugues automaticconfigurationofthecassandradatabaseusingirace