More Than Preservation: A Researcher-Centered Approach to Reproducibility in Data Science

Bibliographic Details
Main authors: Feger, Sebastian Stefan; Woźniak, Paweł W
Language: eng
Published: 2019
Subjects:
Online access: http://cds.cern.ch/record/2677268
Description
Summary: As data volumes and complexity in data science work increase, reusability of experimental studies becomes increasingly important. Our research in data-intensive High Energy Physics shows that supporting and motivating data workers in preserving and sharing their resources is a key challenge for future data science work. We report on our studies of practices around preservation that show how secondary uses of preservation technology and non-conventional design tools incentivize best practices and reshape analysts' perceptions of research tools. We expect that our work will impact sustainability and reusability in data science well beyond experimental physics.