
Storing Data Flow Monitoring in Hadoop


Bibliographic Details
Main author: Georgiou, Anastasia
Language: eng
Published: 2013
Subjects:
Online access: http://cds.cern.ch/record/1596239
Description
Summary: The online data flow monitoring for the CMS data acquisition system produces a large amount of data. Only 5% of this data is stored permanently in a relational database, owing to performance issues and the cost of dedicated infrastructure (e.g. Oracle systems). In commercial environments, companies and organizations need new approaches to process such large volumes of data, known as “big data”. The Big Data approach addresses the problem of large, complex collections of data sets that become difficult to handle with traditional data processing applications. With these technologies, it should be possible to store all the monitoring information for a time window of months or a year. This report contains an initial evaluation of Hadoop for the storage of data flow monitoring information and for subsequent data mining.