
A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method

Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close to optimal QEP for the given queries based on the minimum resource usage. The problem is that for a given query, there are plenty of different equivalent execution plans, each w...


Bibliographic Details
Main Authors: Azhir, Elham; Jafari Navimipour, Nima; Hosseinzadeh, Mehdi; Sharifi, Arash; Darwesh, Aso
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176525/
https://www.ncbi.nlm.nih.gov/pubmed/34141897
http://dx.doi.org/10.7717/peerj-cs.580
author Azhir, Elham
Jafari Navimipour, Nima
Hosseinzadeh, Mehdi
Sharifi, Arash
Darwesh, Aso
author_sort Azhir, Elham
collection PubMed
description Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close-to-optimal QEP for the given queries based on minimum resource usage. The problem is that a given query has many equivalent execution plans, each with a corresponding execution cost, so producing an effective query plan requires examining a large number of alternative plans. Access plan recommendation is an alternative technique to database query optimization that reuses previously generated QEPs to execute new queries. In this technique, the query optimizer uses clustering methods to identify groups of similar queries. However, clustering such large datasets is challenging for traditional clustering algorithms because of the long processing time. Numerous cloud-based platforms, such as Hadoop, Hive, and Pig, have been introduced that offer low-cost solutions for processing distributed queries. This paper applies and tests a model that clusters large query datasets of varying sizes in parallel using MapReduce. The results demonstrate that the parallel implementation of query workload clustering achieves good scalability.
format Online
Article
Text
id pubmed-8176525
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8176525 2021-06-16 PeerJ Comput Sci Algorithms and Analysis of Algorithms PeerJ Inc. 2021-06-01 /pmc/articles/PMC8176525/ /pubmed/34141897 http://dx.doi.org/10.7717/peerj-cs.580 Text en ©2021 Azhir et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method
topic Algorithms and Analysis of Algorithms
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176525/
https://www.ncbi.nlm.nih.gov/pubmed/34141897
http://dx.doi.org/10.7717/peerj-cs.580