gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair

The explosive volume of semantic data published in the Resource Description Framework (RDF) data model demands efficient management and compression, with better compression ratios and runtimes. Although extensive work has been carried out on compressing RDF datasets, existing compressors do not perform well in al...

Bibliographic Details
Main authors: Sultana, Tangina; Lee, Young-Koo
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003471/
https://www.ncbi.nlm.nih.gov/pubmed/35408160
http://dx.doi.org/10.3390/s22072545
_version_ 1784686142177673216
author Sultana, Tangina
Lee, Young-Koo
author_facet Sultana, Tangina
Lee, Young-Koo
author_sort Sultana, Tangina
collection PubMed
description The explosive volume of semantic data published in the Resource Description Framework (RDF) data model demands efficient management and compression, with better compression ratios and runtimes. Although extensive work has been carried out on compressing RDF datasets, existing compressors do not perform well in all dimensions, and they rarely exploit the graph patterns and structural regularities of real-world datasets. Separately, a variety of existing approaches reduce the size of a graph by using a grammar-based graph compression algorithm. In this study, we introduce a novel approach named gRDF (graph repair for RDF) that uses gRePair, one of the most efficient grammar-based graph compression schemes, to compress RDF datasets. In addition, we improve the performance of HDT (header-dictionary-triple), an efficient approach for compressing RDF datasets based on structural properties, by introducing modified HDT (M-HDT), which detects frequent graph patterns in a single pass over the dataset by employing a data-structure-oriented approach. In our proposed system, we use M-HDT to index the nodes and edge labels. Then, we employ the gRePair algorithm to identify the grammar from the RDF graph. Afterward, the system improves the performance of k²-trees by introducing a more efficient algorithm to create the trees and serialize the RDF datasets. Our experiments affirm that the proposed gRDF scheme achieves approximately 26.12%, 13.68%, 6.81%, 2.38%, and 12.76% better compression ratios than the most prominent state-of-the-art schemes, HDT, HDT++, k²-trees, RDF-TR, and gRePair, respectively, on real-world datasets. Moreover, the processing efficiency of our proposed scheme also outperforms that of the others.
format Online
Article
Text
id pubmed-9003471
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9003471 2022-04-13 gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair Sultana, Tangina Lee, Young-Koo Sensors (Basel) Article MDPI 2022-03-26 /pmc/articles/PMC9003471/ /pubmed/35408160 http://dx.doi.org/10.3390/s22072545 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Sultana, Tangina
Lee, Young-Koo
gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title_full gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title_fullStr gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title_full_unstemmed gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title_short gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair
title_sort grdf: an efficient compressor with reduced structural regularities that utilizes grepair
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003471/
https://www.ncbi.nlm.nih.gov/pubmed/35408160
http://dx.doi.org/10.3390/s22072545
work_keys_str_mv AT sultanatangina grdfanefficientcompressorwithreducedstructuralregularitiesthatutilizesgrepair
AT leeyoungkoo grdfanefficientcompressorwithreducedstructuralregularitiesthatutilizesgrepair