Subscribing to big data at scale
Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries...
Main authors: | Wang, Xikui; Carey, Michael J.; Tsotras, Vassilis J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987522/ https://www.ncbi.nlm.nih.gov/pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w |
_version_ | 1784682761336913920 |
---|---|
author | Wang, Xikui Carey, Michael J. Tsotras, Vassilis J. |
author_facet | Wang, Xikui Carey, Michael J. Tsotras, Vassilis J. |
author_sort | Wang, Xikui |
collection | PubMed |
description | Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisfy both passive and active requests at scale, application developers need either to heavily customize an existing passive Big Data system or to glue one together with systems like Streaming Engines and Pub-sub services. Either choice requires significant effort and incurs additional overhead. In this paper, we present the BAD (Big Active Data) system as an end-to-end, out-of-the-box solution for this challenge. It is designed to preserve the merits of passive Big Data systems and introduces new features for actively serving Big Data to users at scale. We show the design and implementation of the BAD system, demonstrate how BAD facilitates providing both passive and active data services, investigate the BAD system’s performance at scale, and illustrate the complexities that would result from instead providing BAD-like services with a “glued” system. |
format | Online Article Text |
id | pubmed-8987522 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8987522 2022-04-07 Subscribing to big data at scale Wang, Xikui Carey, Michael J. Tsotras, Vassilis J. Distrib Parallel Databases Article Springer US 2022-04-07 2022 /pmc/articles/PMC8987522/ /pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Wang, Xikui Carey, Michael J. Tsotras, Vassilis J. Subscribing to big data at scale |
title | Subscribing to big data at scale |
title_full | Subscribing to big data at scale |
title_fullStr | Subscribing to big data at scale |
title_full_unstemmed | Subscribing to big data at scale |
title_short | Subscribing to big data at scale |
title_sort | subscribing to big data at scale |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8987522/ https://www.ncbi.nlm.nih.gov/pubmed/35411128 http://dx.doi.org/10.1007/s10619-022-07406-w |
work_keys_str_mv | AT wangxikui subscribingtobigdataatscale AT careymichaelj subscribingtobigdataatscale AT tsotrasvassilisj subscribingtobigdataatscale |