iRODS Research Community Requirements Drive Expanded Scale Data Management Features


Bibliographic Details
Main author: Russell, Terrell
Language: eng
Published: 2022
Subjects: HEP Computing
Online access: http://cds.cern.ch/record/2802313
_version_ 1780972740589649920
author Russell, Terrell
author_facet Russell, Terrell
author_sort Russell, Terrell
collection CERN
description Several years ago, the entire process of data management and collaboration could only be performed with proprietary software products that were expensive to license. To maintain a collection, data sites required a file system, a hierarchical storage management system, and some means of sharing the data across several geographically diverse sites using purchased software, often from a single vendor to ensure compatibility. Data site managers were placed in a difficult position, facing quickly growing data capacity and transmission demands with limited budgets. Constraints from funding agencies and governments became very difficult, if not impossible, to manage and audit. The iRODS (Integrated Rule-Oriented Data System) Consortium was started as an open-source software development organization in 2013 by members of the research and storage communities. The technology has its roots in an earlier project started in 1995. The Consortium was launched in response to a major increase in the scale of management and storage needs driven by the advent of "big data". The member community now comprises more than 30 members and spans the globe from Australia to Japan and much of the EU. Recent innovations resulting from community requirements will be discussed, including graphical interfaces and methods to ensure data persistence and replication management. In addition, partnerships with Globus and others to enable large-scale collaboration will be discussed. Today, worldwide, FAIR discovery and directed dissemination of HPC results are being accomplished at sites controlling tens of petabytes of data with this open-source technology.
id cern-2802313
institution European Organization for Nuclear Research
language eng
publishDate 2022
record_format invenio
spelling cern-2802313 2022-11-02T22:04:02Z http://cds.cern.ch/record/2802313 eng Russell, Terrell iRODS Research Community Requirements Drive Expanded Scale Data Management Features CS3 2022 - Cloud Storage Synchronization and Sharing HEP Computing oai:cds.cern.ch:2802313 2022
spellingShingle HEP Computing
Russell, Terrell
iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title_full iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title_fullStr iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title_full_unstemmed iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title_short iRODS Research Community Requirements Drive Expanded Scale Data Management Features
title_sort irods research community requirements drive expanded scale data management features
topic HEP Computing
url http://cds.cern.ch/record/2802313
work_keys_str_mv AT russellterrell irodsresearchcommunityrequirementsdriveexpandedscaledatamanagementfeatures
AT russellterrell cs32022cloudstoragesynchronizationandsharing