
Subscribing to big data at scale

Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries...

Bibliographic Details
Main authors: Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J.
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987522/
https://www.ncbi.nlm.nih.gov/pubmed/35411128
http://dx.doi.org/10.1007/s10619-022-07406-w
author Wang, Xikui
Carey, Michael J.
Tsotras, Vassilis J.
collection PubMed
description Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, application developers need either to heavily customize an existing passive Big Data system or to glue one together with systems like Streaming Engines and Pub-sub services. Either choice requires significant effort and incurs additional overhead. In this paper, we present the BAD (Big Active Data) system as an end-to-end, out-of-the-box solution for this challenge. It is designed to preserve the merits of passive Big Data systems and introduces new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system’s performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a “glued” system.
format Online
Article
Text
id pubmed-8987522
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-8987522 2022-04-07 Subscribing to big data at scale. Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J. Distrib Parallel Databases. Article. Springer US 2022-04-07. Text, en. © The Author(s) 2022.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Subscribing to big data at scale
topic Article