
Monitoring of services with non-relational databases and map-reduce framework

Service Availability Monitoring (SAM) is a well-established monitoring framework that performs regular measurements of the core site services and reports the corresponding availability and reliability of the Worldwide LHC Computing Grid (WLCG) infrastructure. One of the existing extensions of SAM is...


Bibliographic Details
Main Authors: Babik, M, Souto, F
Language: eng
Published: 2012
Subjects: Computing and Computers
Online Access: https://dx.doi.org/10.1088/1742-6596/396/5/052008
http://cds.cern.ch/record/1457992
_version_ 1780925150143709184
author Babik, M
Souto, F
author_facet Babik, M
Souto, F
author_sort Babik, M
collection CERN
description Service Availability Monitoring (SAM) is a well-established monitoring framework that performs regular measurements of the core site services and reports the corresponding availability and reliability of the Worldwide LHC Computing Grid (WLCG) infrastructure. One of the existing extensions of SAM is Site Wide Area Testing (SWAT), which gathers monitoring information from the worker nodes via instrumented jobs. This generates quite a lot of monitoring data to process, as there are several data points for every job and several million jobs are executed every day. The recent uptake of non-relational databases opens a new paradigm in the large-scale storage and distributed processing of systems with heavy read-write workloads. For SAM this brings new possibilities to improve its model, from performing aggregation of measurements to storing raw data and subsequent re-processing. Both SAM and SWAT are currently tuned to run at top performance, reaching some of the limits in storage and processing power of their existing Oracle relational database. We investigated the usability and performance of non-relational storage together with its distributed data processing capabilities. For this, several popular systems have been compared. In this contribution we describe our investigation of the existing non-relational databases suited for monitoring systems covering Cassandra, HBase and MongoDB. Further, we present our experiences in data modeling and prototyping map-reduce algorithms focusing on the extension of the already existing availability and reliability computations. Finally, possible future directions in this area are discussed, analyzing the current deficiencies of the existing Grid monitoring systems and proposing solutions to leverage the benefits of the non-relational databases to get more scalable and flexible frameworks.
id cern-1457992
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2012
record_format invenio
spelling cern-1457992
2022-08-17T13:32:59Z
doi:10.1088/1742-6596/396/5/052008
http://cds.cern.ch/record/1457992
eng
Babik, M
Souto, F
Monitoring of services with non-relational databases and map-reduce framework
Computing and Computers
Service Availability Monitoring (SAM) is a well-established monitoring framework that performs regular measurements of the core site services and reports the corresponding availability and reliability of the Worldwide LHC Computing Grid (WLCG) infrastructure. One of the existing extensions of SAM is Site Wide Area Testing (SWAT), which gathers monitoring information from the worker nodes via instrumented jobs. This generates quite a lot of monitoring data to process, as there are several data points for every job and several million jobs are executed every day. The recent uptake of non-relational databases opens a new paradigm in the large-scale storage and distributed processing of systems with heavy read-write workloads. For SAM this brings new possibilities to improve its model, from performing aggregation of measurements to storing raw data and subsequent re-processing. Both SAM and SWAT are currently tuned to run at top performance, reaching some of the limits in storage and processing power of their existing Oracle relational database. We investigated the usability and performance of non-relational storage together with its distributed data processing capabilities. For this, several popular systems have been compared. In this contribution we describe our investigation of the existing non-relational databases suited for monitoring systems covering Cassandra, HBase and MongoDB. Further, we present our experiences in data modeling and prototyping map-reduce algorithms focusing on the extension of the already existing availability and reliability computations. Finally, possible future directions in this area are discussed, analyzing the current deficiencies of the existing Grid monitoring systems and proposing solutions to leverage the benefits of the non-relational databases to get more scalable and flexible frameworks.
CERN-IT-Note-2012-019
oai:cds.cern.ch:1457992
2012-06-01
spellingShingle Computing and Computers
Babik, M
Souto, F
Monitoring of services with non-relational databases and map-reduce framework
title Monitoring of services with non-relational databases and map-reduce framework
title_full Monitoring of services with non-relational databases and map-reduce framework
title_fullStr Monitoring of services with non-relational databases and map-reduce framework
title_full_unstemmed Monitoring of services with non-relational databases and map-reduce framework
title_short Monitoring of services with non-relational databases and map-reduce framework
title_sort monitoring of services with non-relational databases and map-reduce framework
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/396/5/052008
http://cds.cern.ch/record/1457992
work_keys_str_mv AT babikm monitoringofserviceswithnonrelationaldatabasesandmapreduceframework
AT soutof monitoringofserviceswithnonrelationaldatabasesandmapreduceframework