
Pushing ML Predictions Into DBMSs

In the past decade, many approaches have been suggested to execute ML workloads on a DBMS. However, most of them have looked at in-DBMS ML from a training perspective, whereas ML inference has been largely overlooked. We think that this is an important gap to fill for two main reasons: (1) in the ne...


Bibliographic Details
Format: Online Article Text
Language: English
Published: IEEE 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620958/
https://www.ncbi.nlm.nih.gov/pubmed/37954972
http://dx.doi.org/10.1109/TKDE.2023.3269592
_version_ 1785130314005217280
collection PubMed
description In the past decade, many approaches have been suggested to execute ML workloads on a DBMS. However, most of them have looked at in-DBMS ML from a training perspective, whereas ML inference has been largely overlooked. We think that this is an important gap to fill for two main reasons: (1) in the near future, every application will be infused with some sort of ML capability; (2) behind every web page, application, and enterprise there is a DBMS, whereby in-DBMS inference is an appealing solution for efficiency (e.g., less data movement), performance (e.g., cross-optimizations between relational operators and ML), and governance. In this article, we study whether DBMSs are a good fit for prediction serving. We introduce a technique for translating trained ML pipelines containing both featurizers (e.g., one-hot encoding) and models (e.g., linear and tree-based models) into SQL queries, and we compare in-DBMS performance against popular ML frameworks such as Sklearn and ml.net. Our experiments show that, when pushed inside a DBMS, trained ML pipelines can have performance comparable to ML frameworks in several scenarios, while they perform quite poorly on text featurization and over (even simple) neural networks.
format Online
Article
Text
id pubmed-10620958
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher IEEE
record_format MEDLINE/PubMed
spelling pubmed-10620958 2023-11-07 Pushing ML Predictions Into DBMSs IEEE Trans Knowl Data Eng Article IEEE 2023-04-24 /pmc/articles/PMC10620958/ /pubmed/37954972 http://dx.doi.org/10.1109/TKDE.2023.3269592 Text en © 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
title Pushing ML Predictions Into DBMSs
topic Article
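
The description above mentions translating trained ML pipelines (featurizers such as one-hot encoding plus linear models) into SQL queries. The record does not show the authors' actual translation, so the following is only a minimal sketch of the general idea, assuming a hypothetical `items` table, hypothetical column names, and made-up weights. It uses Python's standard `sqlite3` module purely as a stand-in DBMS.

```python
import sqlite3


def one_hot_sql(column, categories):
    # One CASE expression per category value: the SQL form of one-hot encoding.
    # Assumption: category values are trusted literals (no escaping is done here).
    return {
        f"{column}_{cat}": f"(CASE WHEN {column} = '{cat}' THEN 1.0 ELSE 0.0 END)"
        for cat in categories
    }


def linear_model_sql(feature_exprs, weights, bias):
    # bias + sum(weight_i * feature_i), written as a single SQL expression.
    terms = [f"({weights[name]!r} * {expr})" for name, expr in feature_exprs.items()]
    return " + ".join([repr(float(bias))] + terms)


def predict_in_db(conn):
    # Hypothetical trained pipeline: one-hot on `color`, pass-through on `size`.
    features = one_hot_sql("color", ["red", "blue"])
    features["size"] = "size"
    weights = {"color_red": 2.0, "color_blue": -1.0, "size": 0.5}  # made-up values
    score = linear_model_sql(features, weights, bias=1.0)
    query = f"SELECT id, {score} AS prediction FROM items ORDER BY id"
    return conn.execute(query).fetchall()


def demo():
    # In-memory table standing in for data that already lives in the DBMS.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, color TEXT, size REAL)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(1, "red", 4.0), (2, "blue", 2.0), (3, "green", 10.0)],
    )
    return predict_in_db(conn)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

In this sketch the whole pipeline collapses into one `SELECT` expression, so scoring happens where the rows are stored and no data leaves the database. An unseen category (`green` here) simply yields all-zero one-hot features, which is one possible convention and not necessarily the one used in the article.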