Storing Data Flow Monitoring in Hadoop
Main author | Georgiou, Anastasia |
---|---|
Language | eng |
Published | 2013 |
Subjects | Computing and Computers |
Online access | http://cds.cern.ch/record/1596239 |
_version_ | 1780931163533082624 |
---|---|
author | Georgiou, Anastasia |
author_facet | Georgiou, Anastasia |
author_sort | Georgiou, Anastasia |
collection | CERN |
description | The online data flow monitoring for the CMS data acquisition system produces a large amount of data. Only 5% of the data is stored permanently in a relational database, due to performance issues and the cost of using dedicated infrastructure (e.g. Oracle systems). In a commercial environment, companies and organizations need to find innovative approaches to processing such large volumes of data, known as “big data”. The big-data approach addresses the problem of large and complex collections of data sets that are difficult to handle with traditional data-processing applications. Using these technologies, it should be possible to store all the monitoring information for a time window of months or a year. This report contains an initial evaluation of Hadoop for storage of data flow monitoring and subsequent data mining. |
id | cern-1596239 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2013 |
record_format | invenio |
spelling | cern-1596239 2019-09-30T06:29:59Z http://cds.cern.ch/record/1596239 eng Georgiou, Anastasia Storing Data Flow Monitoring in Hadoop Computing and Computers The online data flow monitoring for the CMS data acquisition system produces a large amount of data. Only 5% of the data is stored permanently in a relational database, due to performance issues and the cost of using dedicated infrastructure (e.g. Oracle systems). In a commercial environment, companies and organizations need to find innovative approaches to processing such large volumes of data, known as “big data”. The big-data approach addresses the problem of large and complex collections of data sets that are difficult to handle with traditional data-processing applications. Using these technologies, it should be possible to store all the monitoring information for a time window of months or a year. This report contains an initial evaluation of Hadoop for storage of data flow monitoring and subsequent data mining. CERN-STUDENTS-Note-2013-144 oai:cds.cern.ch:1596239 2013-08-30 |
spellingShingle | Computing and Computers Georgiou, Anastasia Storing Data Flow Monitoring in Hadoop |
title | Storing Data Flow Monitoring in Hadoop |
title_full | Storing Data Flow Monitoring in Hadoop |
title_fullStr | Storing Data Flow Monitoring in Hadoop |
title_full_unstemmed | Storing Data Flow Monitoring in Hadoop |
title_short | Storing Data Flow Monitoring in Hadoop |
title_sort | storing data flow monitoring in hadoop |
topic | Computing and Computers |
url | http://cds.cern.ch/record/1596239 |
work_keys_str_mv | AT georgiouanastasia storingdataflowmonitoringinhadoop |
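For readers curious about the approach the abstract describes, the following is a minimal, hypothetical sketch (not taken from the report): monitoring samples appended as date-partitioned JSON-lines files, a layout commonly used when ingesting data into HDFS. A local temporary directory stands in for an HDFS path here, and all field and function names (`write_sample`, `node`, `rate_hz`) are illustrative assumptions.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

def write_sample(root, sample):
    """Append one monitoring sample as a JSON line under a date partition.

    Date-partitioned JSON-lines (date=YYYY-MM-DD/part-*.jsonl) is a common
    layout for HDFS ingestion; `root` stands in for an HDFS path here, and
    all field names are illustrative.
    """
    ts = datetime.fromtimestamp(sample["timestamp"], tz=timezone.utc)
    part_dir = os.path.join(root, f"date={ts:%Y-%m-%d}")
    os.makedirs(part_dir, exist_ok=True)
    path = os.path.join(part_dir, "part-00000.jsonl")
    with open(path, "a") as f:
        f.write(json.dumps(sample) + "\n")
    return path

# Hypothetical sample: one readout-unit rate measurement on 2013-08-30 UTC.
root = tempfile.mkdtemp()
p = write_sample(root, {"timestamp": 1377820800, "node": "ru-01", "rate_hz": 95000})
print(p)
```

Partitioning by date keeps each day's monitoring data in its own directory, so a query over a time window of months only has to scan the matching partitions rather than the full archive.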