
A Formalization of SQL with Nulls

SQL is the world’s most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal specification. Commercial SQL implementations interpret the standard...


Bibliographic Details
Main Authors: Ricciotti, Wilmer, Cheney, James
Format: Online Article Text
Language: English
Published: Springer Netherlands 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9637088/
https://www.ncbi.nlm.nih.gov/pubmed/36353685
http://dx.doi.org/10.1007/s10817-022-09632-4
_version_ 1784825098996285440
author Ricciotti, Wilmer
Cheney, James
author_facet Ricciotti, Wilmer
Cheney, James
author_sort Ricciotti, Wilmer
collection PubMed
description SQL is the world’s most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal specification. Commercial SQL implementations interpret the standard in different ways, so that, given the same input data, the same query can yield different results depending on the SQL system it is run on. Even for a particular system, mechanically checked formalization of all widely-used features of SQL remains an open problem. The lack of a well-understood formal semantics makes it very difficult to validate the soundness of database implementations. Although formal semantics for fragments of SQL were designed in the past, they usually did not support set and bag operations, lateral joins, nested subqueries, and, crucially, null values. Null values complicate SQL’s semantics in profound ways analogous to null pointers or side-effects in other programming languages. Since certain SQL queries are equivalent in the absence of null values, but produce different results when applied to tables containing incomplete data, semantics which ignore null values are able to prove query equivalences that are unsound in realistic databases. A formal semantics of SQL supporting all the aforementioned features was only proposed recently. In this paper, we report about our mechanization of SQL semantics covering set/bag operations, lateral joins, nested subqueries, and nulls, written in the Coq proof assistant, and describe the validation of key metatheoretic properties. Additionally, we are able to use the same framework to formalize the semantics of a flat relational calculus (with null values), and show a certified translation of its normal forms into SQL.
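The abstract's point that two queries can agree on null-free data yet diverge once nulls appear can be illustrated with a minimal sketch. It is not taken from the paper: the table `t`, the column `a`, the two queries, and the choice of SQLite (through Python's standard `sqlite3` module) are all assumptions made for illustration, and other SQL systems may behave differently in other cases.

```python
import sqlite3

# Hypothetical queries for illustration only (not from the paper).
# Without nulls, "a = 1 OR a <> 1" holds for every row, so both queries agree.
Q_ALL = "SELECT a FROM t ORDER BY rowid"
Q_FILTERED = "SELECT a FROM t WHERE a = 1 OR a <> 1 ORDER BY rowid"


def run_both(values):
    """Load `values` into a one-column table and run both queries on it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (a INTEGER)")
        # Python's None is stored as SQL NULL.
        conn.executemany("INSERT INTO t (a) VALUES (?)", [(v,) for v in values])
        all_rows = [row[0] for row in conn.execute(Q_ALL)]
        filtered = [row[0] for row in conn.execute(Q_FILTERED)]
        return all_rows, filtered
    finally:
        conn.close()


if __name__ == "__main__":
    # Null-free data: the two queries return the same rows.
    print(run_both([1, 2]))
    # With a null: comparisons against NULL evaluate to unknown under
    # three-valued logic, so the WHERE clause drops the NULL row.
    print(run_both([1, None, 2]))
```

On the null-free input both result lists coincide. On the input containing `None`, the filtered query omits the null row, so a semantics that ignores nulls would wrongly treat the two queries as interchangeable.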
format Online
Article
Text
id pubmed-9637088
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-9637088 2022-11-07 A Formalization of SQL with Nulls Ricciotti, Wilmer Cheney, James J Autom Reason Article
Springer Netherlands 2022-07-27 2022 /pmc/articles/PMC9637088/ /pubmed/36353685 http://dx.doi.org/10.1007/s10817-022-09632-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Ricciotti, Wilmer
Cheney, James
A Formalization of SQL with Nulls
title A Formalization of SQL with Nulls
title_full A Formalization of SQL with Nulls
title_fullStr A Formalization of SQL with Nulls
title_full_unstemmed A Formalization of SQL with Nulls
title_short A Formalization of SQL with Nulls
title_sort formalization of sql with nulls
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9637088/
https://www.ncbi.nlm.nih.gov/pubmed/36353685
http://dx.doi.org/10.1007/s10817-022-09632-4
work_keys_str_mv AT ricciottiwilmer aformalizationofsqlwithnulls
AT cheneyjames aformalizationofsqlwithnulls
AT ricciottiwilmer formalizationofsqlwithnulls
AT cheneyjames formalizationofsqlwithnulls