Automatic configuration of the Cassandra database using irace
Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to...
Main authors: | Silva-Muñoz, Moisés; Franzin, Alberto; Bersini, Hugues |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356662/ https://www.ncbi.nlm.nih.gov/pubmed/34435094 http://dx.doi.org/10.7717/peerj-cs.634 |
_version_ | 1783736989225320448 |
---|---|
author | Silva-Muñoz, Moisés; Franzin, Alberto; Bersini, Hugues |
author_facet | Silva-Muñoz, Moisés; Franzin, Alberto; Bersini, Hugues |
author_sort | Silva-Muñoz, Moisés |
collection | PubMed |
description | Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in recent years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require large amounts of data and fail to represent all the factors that impact the performance of a database, or implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup and obtain speedups of up to 30% over the default configuration in terms of throughput, and we provide an analysis of the configurations obtained. |
format | Online Article Text |
id | pubmed-8356662 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8356662 2021-08-24 Automatic configuration of the Cassandra database using irace Silva-Muñoz, Moisés; Franzin, Alberto; Bersini, Hugues PeerJ Comput Sci Artificial Intelligence Database systems play a central role in modern data-centered applications. Their performance is thus a key factor in the efficiency of data processing pipelines. Modern database systems expose several parameters that users and database administrators can configure to tailor the database settings to the specific application considered. While this task has traditionally been performed manually, in recent years several methods have been proposed to automatically find the best parameter configuration for a database. Many of these methods, however, use statistical models that require large amounts of data and fail to represent all the factors that impact the performance of a database, or implement complex algorithmic solutions. In this work we study the potential of a simple model-free general-purpose configuration tool to automatically find the best parameter configuration of a database. We use the irace configurator to automatically find the best parameter configuration for the Cassandra NoSQL database using the YCSB benchmark under different scenarios. We establish a reliable experimental setup and obtain speedups of up to 30% over the default configuration in terms of throughput, and we provide an analysis of the configurations obtained. PeerJ Inc. 2021-08-05 /pmc/articles/PMC8356662/ /pubmed/34435094 http://dx.doi.org/10.7717/peerj-cs.634 Text en © 2021 Silva-Muñoz et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
title | Automatic configuration of the Cassandra database using irace |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356662/ https://www.ncbi.nlm.nih.gov/pubmed/34435094 http://dx.doi.org/10.7717/peerj-cs.634 |