
A new framework based on features modeling and ensemble learning to predict query performance

A query optimizer attempts to predict a performance metric based on elapsed time. In theory, this requires the core engine to bear significant overhead to supply the necessary query optimization statistics. Machine learning is increasingly being used to improve...


Bibliographic Details
Main authors: Zaghloul, Mohamed; Salem, Mofreh; Ali-Eldin, Amr
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523072/
https://www.ncbi.nlm.nih.gov/pubmed/34662344
http://dx.doi.org/10.1371/journal.pone.0258439
_version_ 1784585218448949248
author Zaghloul, Mohamed
Salem, Mofreh
Ali-Eldin, Amr
author_facet Zaghloul, Mohamed
Salem, Mofreh
Ali-Eldin, Amr
author_sort Zaghloul, Mohamed
collection PubMed
description A query optimizer attempts to predict a performance metric based on elapsed time. In theory, this requires the core engine to bear significant overhead to supply the necessary query optimization statistics. Machine learning is increasingly being used to improve query performance by incorporating regression models. To predict the response time of a query, most query performance prediction approaches rely on DBMS optimizer statistics and the estimated cost of each operator in the query execution plan, with a focus on resource utilization (CPU, I/O). Modeling query features is thus a critical step in developing a robust query performance prediction model. In this paper, we propose a new framework based on query feature modeling and ensemble learning to predict query performance, and we use this framework as a query performance prediction simulator to optimize the query features that influence query performance. For query feature modeling, we propose five dimensions: syntax, hardware, software, data architecture, and historical performance logs. The training datasets for the performance prediction model, which employs ensemble learning, are developed from these features. Ensemble learning allows the query performance prediction model to deal with missing values and to handle overfitting via regularization. The experimental section describes how the proposed framework is applied. The training dataset in this paper consists of performance data logs from various real-world environments. The outcomes were compared to show the difference between the actual and expected performance of the proposed prediction model. Empirical work shows the effectiveness of the proposed approach compared to related work.
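The description above names three ingredients: features drawn from five dimensions, an ensemble regressor that tolerates missing values, and regularization against overfitting. The record does not say which ensemble algorithm the authors used, so the following is only a minimal sketch that substitutes gradient-boosted decision stumps written in plain Python. The column names in `FEATURES`, the sample rows, and the parameters `rounds`, `learning_rate`, and `l2` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing feature value

# Hypothetical columns, one per feature dimension listed in the abstract.
FEATURES = ["syntax_join_count", "hardware_cpu_cores", "software_dbms_major",
            "data_rows_millions", "history_avg_seconds"]


@dataclass
class Stump:
    """One weak learner: a single split with a default side for missing values."""
    feature: int
    threshold: float
    missing_left: bool
    left_value: float
    right_value: float

    def goes_left(self, row: Row) -> bool:
        v = row[self.feature]
        return self.missing_left if v is None else v <= self.threshold

    def predict(self, row: Row) -> float:
        return self.left_value if self.goes_left(row) else self.right_value


def leaf_value(residuals: List[float], l2: float) -> float:
    # L2-regularised mean: the penalty l2 shrinks weakly supported leaves toward 0.
    return sum(residuals) / (len(residuals) + l2)


def fit_stump(rows: List[Row], residuals: List[float], l2: float) -> Stump:
    flat = leaf_value(residuals, l2)
    best = Stump(0, float("inf"), True, flat, flat)  # fallback: no split at all
    best_err = sum((r - flat) ** 2 for r in residuals)
    for f in range(len(rows[0])):
        seen = sorted({row[f] for row in rows if row[f] is not None})
        for t in seen[:-1]:  # skipping the maximum keeps both sides non-empty
            for missing_left in (True, False):  # try both default directions
                cand = Stump(f, t, missing_left, 0.0, 0.0)
                left = [r for row, r in zip(rows, residuals) if cand.goes_left(row)]
                right = [r for row, r in zip(rows, residuals) if not cand.goes_left(row)]
                cand.left_value = leaf_value(left, l2)
                cand.right_value = leaf_value(right, l2)
                err = (sum((r - cand.left_value) ** 2 for r in left)
                       + sum((r - cand.right_value) ** 2 for r in right))
                if err < best_err:
                    best, best_err = cand, err
    return best


@dataclass
class Ensemble:
    base: float
    learning_rate: float
    stumps: List[Stump]

    def predict(self, row: Row) -> float:
        return self.base + self.learning_rate * sum(s.predict(row) for s in self.stumps)


def fit_ensemble(rows: List[Row], targets: List[float], rounds: int = 30,
                 learning_rate: float = 0.3, l2: float = 1.0) -> Ensemble:
    # Start from the mean runtime, then let each stump fit the remaining residuals.
    model = Ensemble(sum(targets) / len(targets), learning_rate, [])
    for _ in range(rounds):
        residuals = [y - model.predict(row) for row, y in zip(rows, targets)]
        model.stumps.append(fit_stump(rows, residuals, l2))
    return model


# Made-up log entries (not data from the paper); None = statistic not recorded.
rows = [
    [1, 4, 12, 0.5, 0.8],
    [2, 4, 12, 1.0, None],
    [3, 8, None, 2.0, 2.9],
    [5, 8, 13, 4.0, 6.1],
    [None, 16, 13, 8.0, 9.5],
    [8, 16, 13, 9.0, None],
]
runtimes = [0.9, 1.7, 3.1, 6.0, 9.8, 11.2]  # observed response times in seconds
model = fit_ensemble(rows, runtimes)
print(model.predict([4, 8, None, 3.0, None]))  # estimate for an unseen query
```

Each stump sends missing values to whichever side of the split fits the training residuals better, so incomplete log entries need no imputation step. The `l2` penalty and the learning rate are the two regularization controls in this sketch; raising `l2` or lowering `learning_rate` makes each weak learner contribute less.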
format Online
Article
Text
id pubmed-8523072
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8523072 2021-10-19 A new framework based on features modeling and ensemble learning to predict query performance Zaghloul, Mohamed Salem, Mofreh Ali-Eldin, Amr PLoS One Research Article Public Library of Science 2021-10-18 /pmc/articles/PMC8523072/ /pubmed/34662344 http://dx.doi.org/10.1371/journal.pone.0258439 Text en © 2021 Zaghloul et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A new framework based on features modeling and ensemble learning to predict query performance
title_sort new framework based on features modeling and ensemble learning to predict query performance
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523072/
https://www.ncbi.nlm.nih.gov/pubmed/34662344
http://dx.doi.org/10.1371/journal.pone.0258439