
Dataset Classification

The open data of the CMS experiment includes several thousand simulation datasets. To make this data searchable, category tags are assigned to each dataset describing the type of physical process it contains. In the open data sample to be released next, there are hundreds of datasets without assigne...


Bibliographic Details
Main authors: Davalos Carrera, Josue Eulises, Elenter Litwin, Juan
Language: eng
Published: 2021
Subjects:
Online access: http://cds.cern.ch/record/2789079
_version_ 1780972170700128256
author Davalos Carrera, Josue Eulises
Elenter Litwin, Juan
author_facet Davalos Carrera, Josue Eulises
Elenter Litwin, Juan
author_sort Davalos Carrera, Josue Eulises
collection CERN
description The open data of the CMS experiment includes several thousand simulation datasets. To make this data searchable, category tags are assigned to each dataset describing the type of physical process it contains. In the open data sample to be released next, there are hundreds of datasets without assigned categories. A correct assignment requires knowledge of the particular physics process and can be tedious to verify because of the wide variety of processes present. The goal of this project is to create a game that allows CMS physicists to contribute to the work of category assignment in an entertaining way (as opposed to going through a long list of dataset names). The game GUI would pop up a not-yet-categorized dataset and ask the player to assign it to a category (or propose a new one). Additionally, administrators will have a dashboard that shows statistics on the labeled datasets. These statistics take each player's reliability into account.
id cern-2789079
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2021
record_format invenio
spelling cern-2789079 2021-10-29T20:07:23Z http://cds.cern.ch/record/2789079 eng Davalos Carrera, Josue Eulises Elenter Litwin, Juan Dataset Classification Information Transfer and Management Computing and Computers The open data of the CMS experiment includes several thousand simulation datasets. To make this data searchable, category tags are assigned to each dataset describing the type of physical process it contains. In the open data sample to be released next, there are hundreds of datasets without assigned categories. A correct assignment requires knowledge of the particular physics process and can be tedious to verify because of the wide variety of processes present. The goal of this project is to create a game that allows CMS physicists to contribute to the work of category assignment in an entertaining way (as opposed to going through a long list of dataset names). The game GUI would pop up a not-yet-categorized dataset and ask the player to assign it to a category (or propose a new one). Additionally, administrators will have a dashboard that shows statistics on the labeled datasets. These statistics take each player's reliability into account. CERN-STUDENTS-Note-2021-228 oai:cds.cern.ch:2789079 2021-10-29
spellingShingle Information Transfer and Management
Computing and Computers
Davalos Carrera, Josue Eulises
Elenter Litwin, Juan
Dataset Classification
title Dataset Classification
title_full Dataset Classification
title_fullStr Dataset Classification
title_full_unstemmed Dataset Classification
title_short Dataset Classification
title_sort dataset classification
topic Information Transfer and Management
Computing and Computers
url http://cds.cern.ch/record/2789079
work_keys_str_mv AT davaloscarrerajosueeulises datasetclassification
AT elenterlitwinjuan datasetclassification
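
The description says that the dashboard statistics take the reliability of each player into account, but the record does not say how. One plausible reading is a reliability-weighted vote per dataset. The sketch below is only an illustration under that assumption, not the authors' implementation: the function name, the reliability scores in [0, 1], the default score for unknown players, and the player and category names are all hypothetical.

```python
from collections import defaultdict


def weighted_category_votes(votes, reliability, default_reliability=0.5):
    """Aggregate (player, category) votes for one dataset.

    Each vote counts with the weight of the player's reliability score
    (hypothetical values in [0, 1]); unknown players get a default weight.
    Returns the winning category and each category's share of the weight.
    """
    totals = defaultdict(float)
    for player, category in votes:
        totals[category] += reliability.get(player, default_reliability)

    total_weight = sum(totals.values())
    if total_weight == 0:
        # No votes, or every voter has zero reliability: nothing to report.
        return None, {}

    shares = {category: weight / total_weight for category, weight in totals.items()}
    best = max(shares, key=shares.get)
    return best, shares


# Hypothetical votes for a single uncategorized dataset.
votes = [("alice", "Higgs"), ("bob", "Top"), ("carol", "Top")]
reliability = {"alice": 0.9, "bob": 0.3, "carol": 0.4}

best, shares = weighted_category_votes(votes, reliability)
print(best, shares)
```

In this example the single vote from the more reliable player outweighs two votes from less reliable players, which is the kind of effect a reliability-aware statistic would be expected to show. A plain majority count would have picked the other category.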