
Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites

Performance problems in applications should ideally be detected as soon as they occur, i.e., directly when the causing code modification is added to the code repository. To this end, complex and cost-intensive application benchmarks or lightweight but less relevant microbenchmarks can be added to ex...


Bibliographic Details
Main Authors:	Grambow, Martin, Laaber, Christoph, Leitner, Philipp, Bermbach, David
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2021
Subjects:	Databases
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176533/ https://www.ncbi.nlm.nih.gov/pubmed/34141882 http://dx.doi.org/10.7717/peerj-cs.548

_version_	1783703274450321408
author	Grambow, Martin Laaber, Christoph Leitner, Philipp Bermbach, David
author_facet	Grambow, Martin Laaber, Christoph Leitner, Philipp Bermbach, David
author_sort	Grambow, Martin
collection	PubMed
description	Performance problems in applications should ideally be detected as soon as they occur, i.e., directly when the causing code modification is added to the code repository. To this end, complex and cost-intensive application benchmarks or lightweight but less relevant microbenchmarks can be added to existing build pipelines to ensure performance goals. In this paper, we show how the practical relevance of microbenchmark suites can be improved and verified based on the application flow during an application benchmark run. We propose an approach to determine the overlap of common function calls between application and microbenchmarks, describe a method which identifies redundant microbenchmarks, and present a recommendation algorithm which reveals relevant functions that are not covered by microbenchmarks yet. A microbenchmark suite optimized in this way can easily test all functions determined to be relevant by application benchmarks after every code change, thus significantly reducing the risk of undetected performance problems. Our evaluation using two time series databases shows that, depending on the specific application scenario, application benchmarks cover different functions of the system under test. Their respective microbenchmark suites cover between 35.62% and 66.29% of the functions called during the application benchmark, offering substantial room for improvement. Through two use cases (removing redundancies in the microbenchmark suite and recommending not-yet-covered functions), we decrease the total number of microbenchmarks and increase the practical relevance of both suites. Removing redundancies can significantly reduce the number of microbenchmarks (and thus the execution time as well) to ~10% and ~23% of the original microbenchmark suites, respectively, whereas recommendation identifies up to 26 and 14 not-yet-covered functions to benchmark to improve the relevance. By utilizing the differences and synergies of application benchmarks and microbenchmarks, our approach potentially enables effective software performance assurance with performance tests of multiple granularities.
format	Online Article Text
id	pubmed-8176533
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-81765332021-06-16 Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites Grambow, Martin Laaber, Christoph Leitner, Philipp Bermbach, David PeerJ Comput Sci Databases Performance problems in applications should ideally be detected as soon as they occur, i.e., directly when the causing code modification is added to the code repository. To this end, complex and cost-intensive application benchmarks or lightweight but less relevant microbenchmarks can be added to existing build pipelines to ensure performance goals. In this paper, we show how the practical relevance of microbenchmark suites can be improved and verified based on the application flow during an application benchmark run. We propose an approach to determine the overlap of common function calls between application and microbenchmarks, describe a method which identifies redundant microbenchmarks, and present a recommendation algorithm which reveals relevant functions that are not covered by microbenchmarks yet. A microbenchmark suite optimized in this way can easily test all functions determined to be relevant by application benchmarks after every code change, thus, significantly reducing the risk of undetected performance problems. Our evaluation using two time series databases shows that, depending on the specific application scenario, application benchmarks cover different functions of the system under test. Their respective microbenchmark suites cover between 35.62% and 66.29% of the functions called during the application benchmark, offering substantial room for improvement. Through two use cases—removing redundancies in the microbenchmark suite and recommendation of yet uncovered functions—we decrease the total number of microbenchmarks and increase the practical relevance of both suites. 
Removing redundancies can significantly reduce the number of microbenchmarks (and thus the execution time as well) to ~10% and ~23% of the original microbenchmark suites, whereas recommendation identifies up to 26 and 14 newly, uncovered functions to benchmark to improve the relevance. By utilizing the differences and synergies of application benchmarks and microbenchmarks, our approach potentially enables effective software performance assurance with performance tests of multiple granularities. PeerJ Inc. 2021-05-28 /pmc/articles/PMC8176533/ /pubmed/34141882 http://dx.doi.org/10.7717/peerj-cs.548 Text en © 2021 Grambow et al. https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
spellingShingle	Databases Grambow, Martin Laaber, Christoph Leitner, Philipp Bermbach, David Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title	Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title_full	Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title_fullStr	Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title_full_unstemmed	Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title_short	Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
title_sort	using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
topic	Databases
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176533/ https://www.ncbi.nlm.nih.gov/pubmed/34141882 http://dx.doi.org/10.7717/peerj-cs.548
work_keys_str_mv	AT grambowmartin usingapplicationbenchmarkcallgraphstoquantifyandimprovethepracticalrelevanceofmicrobenchmarksuites AT laaberchristoph usingapplicationbenchmarkcallgraphstoquantifyandimprovethepracticalrelevanceofmicrobenchmarksuites AT leitnerphilipp usingapplicationbenchmarkcallgraphstoquantifyandimprovethepracticalrelevanceofmicrobenchmarksuites AT bermbachdavid usingapplicationbenchmarkcallgraphstoquantifyandimprovethepracticalrelevanceofmicrobenchmarksuites
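
The three steps named in the description (measuring the overlap of function calls, identifying redundant microbenchmarks, and recommending functions that are not yet covered) can be illustrated with plain set operations. The following Python sketch is not the authors' implementation: the function names, the greedy set-cover heuristic, and the ranking of uncovered functions by a hypothetical call count are all assumptions made for illustration, and the record does not describe the actual algorithms at this level of detail.

```python
from typing import Dict, List, Set


def covered_functions(app_funcs: Set[str], suite: Dict[str, Set[str]]) -> Set[str]:
    """Application-benchmark functions reached by at least one microbenchmark."""
    return set().union(*suite.values()) & app_funcs


def coverage(app_funcs: Set[str], suite: Dict[str, Set[str]]) -> float:
    """Share of application-benchmark functions covered by the suite (0.0 to 1.0)."""
    if not app_funcs:
        return 0.0
    return len(covered_functions(app_funcs, suite)) / len(app_funcs)


def remove_redundant(app_funcs: Set[str], suite: Dict[str, Set[str]]) -> List[str]:
    """Greedy set cover (an assumed heuristic): repeatedly keep the microbenchmark
    that covers the most still-uncovered application functions."""
    remaining = covered_functions(app_funcs, suite)
    selected: List[str] = []
    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(suite), key=lambda name: len(suite[name] & remaining))
        selected.append(best)
        remaining -= suite[best]
    return selected


def recommend(app_funcs: Set[str], suite: Dict[str, Set[str]],
              call_counts: Dict[str, int], n: int) -> List[str]:
    """Uncovered application functions, ranked by a hypothetical call count."""
    uncovered = app_funcs - covered_functions(app_funcs, suite)
    ranked = sorted(uncovered, key=lambda f: (-call_counts.get(f, 0), f))
    return ranked[:n]


if __name__ == "__main__":
    # Made-up data: functions seen in an application benchmark run, and the
    # functions each (hypothetical) microbenchmark calls.
    app = {"parse", "plan", "write", "flush", "compact"}
    micro = {
        "BenchParse": {"parse"},
        "BenchWrite": {"write", "flush"},
        "BenchWriteSmall": {"write"},
        "BenchInternal": {"hash"},
    }
    print(coverage(app, micro))
    print(remove_redundant(app, micro))
    print(recommend(app, micro, {"plan": 120, "compact": 7}, n=1))
```

In this toy data, `BenchWriteSmall` adds nothing beyond `BenchWrite`, and `BenchInternal` touches no function used by the application benchmark, so a reduced suite would drop both without lowering coverage.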
