
Storing Data Flow Monitoring in Hadoop

Bibliographic Details
Main author: Georgiou, Anastasia
Language: eng
Published: 2013
Subjects:
Online access: http://cds.cern.ch/record/1596239
_version_ 1780931163533082624
author Georgiou, Anastasia
author_facet Georgiou, Anastasia
author_sort Georgiou, Anastasia
collection CERN
description The on-line data flow monitoring for the CMS data acquisition system produces a large amount of data. Only 5% of the data is stored permanently in a relational database, owing to performance issues and the cost of using dedicated infrastructure (e.g. Oracle systems). In a commercial environment, companies and organizations need to find innovative approaches to process such large volumes of data, known as “big data”. The Big Data approach addresses the problem of large and complex collections of data sets that are difficult to handle with traditional data processing applications. Using these new technologies, it should be possible to store all the monitoring information for a time window of months or a year. This report contains an initial evaluation of Hadoop for storage of data flow monitoring and subsequent data mining.
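The approach the abstract describes, replacing per-row relational inserts with bulk storage of monitoring records in Hadoop, is often implemented by serializing events into a line-oriented format (e.g. newline-delimited JSON) before writing them to HDFS. A minimal sketch of that serialization step is below; the field names (`ts`, `node`, `rate_mb_s`) are illustrative assumptions, not taken from the CMS DAQ monitoring schema.

```python
import json

def to_jsonl(events):
    """Serialize monitoring events (dicts) to newline-delimited JSON,
    a common layout for bulk ingestion into HDFS.
    Field names here are hypothetical, not the CMS DAQ schema."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)

# Hypothetical data-flow monitoring samples.
events = [
    {"ts": 1377820800, "node": "ru-01", "rate_mb_s": 812.5},
    {"ts": 1377820805, "node": "ru-01", "rate_mb_s": 790.1},
]

blob = to_jsonl(events)
print(blob.count("\n") + 1)  # prints 2, the number of records serialized
```

The resulting text blob can be appended to a file in HDFS as-is; because each record is one self-contained line, MapReduce-style tools can later split and mine the data without any relational schema.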
id cern-1596239
institution European Organization for Nuclear Research
language eng
publishDate 2013
record_format invenio
spelling cern-1596239 2019-09-30T06:29:59Z http://cds.cern.ch/record/1596239 eng Georgiou, Anastasia Storing Data Flow Monitoring in Hadoop Computing and Computers The on-line data flow monitoring for the CMS data acquisition system produces a large amount of data. Only 5% of the data is stored permanently in a relational database, owing to performance issues and the cost of using dedicated infrastructure (e.g. Oracle systems). In a commercial environment, companies and organizations need to find innovative approaches to process such large volumes of data, known as “big data”. The Big Data approach addresses the problem of large and complex collections of data sets that are difficult to handle with traditional data processing applications. Using these new technologies, it should be possible to store all the monitoring information for a time window of months or a year. This report contains an initial evaluation of Hadoop for storage of data flow monitoring and subsequent data mining. CERN-STUDENTS-Note-2013-144 oai:cds.cern.ch:1596239 2013-08-30
spellingShingle Computing and Computers
Georgiou, Anastasia
Storing Data Flow Monitoring in Hadoop
title Storing Data Flow Monitoring in Hadoop
title_full Storing Data Flow Monitoring in Hadoop
title_fullStr Storing Data Flow Monitoring in Hadoop
title_full_unstemmed Storing Data Flow Monitoring in Hadoop
title_short Storing Data Flow Monitoring in Hadoop
title_sort storing data flow monitoring in hadoop
topic Computing and Computers
url http://cds.cern.ch/record/1596239
work_keys_str_mv AT georgiouanastasia storingdataflowmonitoringinhadoop