Creating an Open Archival Information System compliant archive for CERN
Main author: | |
---|---|
Language: | eng |
Published: | 2023 |
Subjects: | |
Online access: | http://cds.cern.ch/record/2857666 |
Summary: | Today, data is produced at an unprecedented scale across many domains. In the context of research data, large organizations such as CERN produce information that is of significant importance and cannot be reproduced in the future. It is therefore our responsibility to make sure that this information is preserved in a way that keeps it available to future generations. This challenge, broadly referred to as digital preservation, has drawn the attention of several researchers and led to the design of a standard for long-term digital data storage, known as the Open Archival Information System (OAIS) standard. Several systems have been developed in this direction; however, these solutions were either not fully compliant with the OAIS standard, were short-term projects that have since been decommissioned, or were not open source and available to the research community. In this context, CERN proposed the Digital Memory project, a digital archiving initiative that should allow researchers to archive their data so that it remains accessible in the future. In this thesis, which is part of the Digital Memory project, we confront the aforementioned challenges by proposing an architecture that is fully OAIS-compliant, is integrated with CERN repositories, and supports transparency, as the user can easily manage and monitor the actions performed on archival packages. First, we implement a tool that harvests data from various CERN sources, such as CDS, Indico, CERN Open Data, GitLab and CodiMD, into an OAIS-compliant format called a Submission Information Package (SIP). This package can be supplied to the platform to create the actual archival packages that are stored for long-term preservation. These packages contain additional metadata and normalized content to guarantee the long-term survival of the information content. Additionally, we show how easily a user can create, monitor and group their archives through the User Interface. The platform can be deployed by anyone on OpenShift using Helm charts. In our evaluation, we discuss how the performance of the platform can be improved, and we show that the resulting packages, as well as the platform as a whole, are fully OAIS-compliant. |
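The summary describes a two-step flow: harvested content is first described as a Submission Information Package (SIP), which is then enriched with additional metadata to form an archival package. The following is a minimal sketch of that idea only, not the thesis's implementation or package format. All function names (`build_sip`, `verify_sip`, `to_aip`), the manifest fields, and the use of SHA-256 checksums are assumptions made for illustration.

```python
import copy
import hashlib
import json
from datetime import datetime, timezone


def build_sip(source: str, record_id: str, files: dict[str, bytes]) -> dict:
    """Describe harvested files as a SIP-like manifest (hypothetical layout)."""
    return {
        "package_type": "SIP",
        "source": source,
        "record_id": record_id,
        "created": datetime.now(timezone.utc).isoformat(),
        # One entry per file, sorted by path, with a fixity checksum.
        "files": [
            {
                "path": path,
                "size": len(content),
                "sha256": hashlib.sha256(content).hexdigest(),
            }
            for path, content in sorted(files.items())
        ],
    }


def verify_sip(sip: dict, files: dict[str, bytes]) -> bool:
    """Return True only if every listed file is present and matches its checksum."""
    for entry in sip["files"]:
        content = files.get(entry["path"])
        if content is None:
            return False
        if hashlib.sha256(content).hexdigest() != entry["sha256"]:
            return False
    return True


def to_aip(sip: dict, preservation_notes: list[str]) -> dict:
    """Derive an AIP-like record by adding preservation metadata to a copy of the SIP."""
    aip = copy.deepcopy(sip)  # leave the submitted package untouched
    aip["package_type"] = "AIP"
    aip["preservation"] = {"notes": list(preservation_notes)}
    return aip


if __name__ == "__main__":
    # Placeholder source name, record id, and file content.
    harvested = {"data/report.txt": b"example content"}
    sip = build_sip("example-source", "example-record", harvested)
    if verify_sip(sip, harvested):
        aip = to_aip(sip, ["fixity verified on ingest"])
        print(json.dumps(aip, indent=2))
```

Keeping the SIP unchanged and deriving the archival record from a copy mirrors the separation between what was submitted and what is stored, which is one way to keep the actions performed on a package traceable.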