
Subscribing to big data at scale

Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries...


Bibliographic Details
Main Authors:	Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J.
Format:	Online Article Text
Language:	English
Published:	Springer US 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987522/ https://www.ncbi.nlm.nih.gov/pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w

_version_	1784682761336913920
author	Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J.
author_facet	Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J.
author_sort	Wang, Xikui
collection	PubMed
description	Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, application developers need either to heavily customize an existing passive Big Data system or to glue one together with systems like Streaming Engines and Pub-sub services. Either choice requires significant effort and incurs additional overhead. In this paper, we present the BAD (Big Active Data) system as an end-to-end, out-of-the-box solution for this challenge. It is designed to preserve the merits of passive Big Data systems and introduces new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system’s performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a “glued” system.
format	Online Article Text
id	pubmed-8987522
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-8987522 2022-04-07 Subscribing to big data at scale Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J. Distrib Parallel Databases Article Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, application developers need either to heavily customize an existing passive Big Data system or to glue one together with systems like Streaming Engines and Pub-sub services. Either choice requires significant effort and incurs additional overhead. In this paper, we present the BAD (Big Active Data) system as an end-to-end, out-of-the-box solution for this challenge. It is designed to preserve the merits of passive Big Data systems and introduces new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system’s performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a “glued” system. Springer US 2022-04-07 2022 /pmc/articles/PMC8987522/ /pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Wang, Xikui Carey, Michael J. Tsotras, Vassilis J. Subscribing to big data at scale
title	Subscribing to big data at scale
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987522/ https://www.ncbi.nlm.nih.gov/pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w
