Evaluating the integration of distributed tracing signals into the CERN Monitoring Infrastructure
Main author: | |
---|---|
Language: | eng |
Published: | 2023 |
Subjects: | |
Online access: | http://cds.cern.ch/record/2868468 |
Summary: | System monitoring requires telemetry: data about the system's behavior, emitted by the system itself. Telemetry data comes in three forms: logs, metrics and traces. Logs and metrics are already used by the current CERN Monitoring infrastructure. Logs provide information on individual events and include some local context, giving debugging and diagnostic information by describing the immediate surroundings of the event. Metrics, on the other hand, give system- and service-level information through the aggregation of measurements across time. With logs and metrics alone there is a discontinuity of information: we can observe individual events and we can observe the system state across time, but reasoning about causality between the two is left to the developer or engineer. Traces aspire to bridge this gap; they link each individual event with the tree of invocation dependencies, from the user request that started the chain to all of the side effects it had in the system. The goal of this project is to create a proof-of-concept deployment of a distributed tracing infrastructure, evaluate how it could be integrated with the existing monitoring infrastructure, and assess what these additions would unlock for improving the verbosity and quality of the information the monitoring team can provide to its clients about the behavior of their systems. |
---|
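
The idea in the summary that a trace links each event to the tree of invocations behind it can be illustrated with a minimal sketch. It is not the project's implementation and does not use any real tracing library: the `Span` class and the `start_trace`, `child_of` and `path_to_root` helpers are hypothetical names chosen for illustration, and the operation names are made up. It assumes only that every span carries a shared trace identifier and a reference to its parent span.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One unit of work in a trace (hypothetical structure, for illustration)."""
    name: str
    trace_id: str  # shared by every span caused by the same user request
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    parent_id: Optional[str] = None  # None marks the root of the tree


def start_trace(name: str) -> Span:
    """Create the root span for a new user request."""
    return Span(name=name, trace_id=uuid.uuid4().hex)


def child_of(parent: Span, name: str) -> Span:
    """Create a span invoked by `parent`, inheriting its trace id."""
    return Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)


def path_to_root(span: Span, spans_by_id: Dict[str, Span]) -> List[str]:
    """Follow parent links from a span back to the request that started the chain."""
    path = [span.name]
    while span.parent_id is not None:
        span = spans_by_id[span.parent_id]
        path.append(span.name)
    return path


# Made-up invocation tree: one request triggers two calls, one of which retries.
root = start_trace("user-request")
auth = child_of(root, "auth-check")
query = child_of(root, "db-query")
retry = child_of(query, "db-retry")

spans_by_id = {s.span_id: s for s in (root, auth, query, retry)}
chain = path_to_root(retry, spans_by_id)
print(" <- ".join(chain))
```

Given any single event, here the retry, the parent links recover the causal chain back to the originating request, which is the kind of reasoning that logs and metrics alone leave to the engineer.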