The Modern Research Data Portal: a design pattern for networked, data-intensive science

We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-m...

Bibliographic Details
Main authors: Chard, Kyle; Dart, Eli; Foster, Ian; Shifflett, David; Tuecke, Steven; Williams, Jason
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Computer Networks and Communications
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924693/
https://www.ncbi.nlm.nih.gov/pubmed/33816800
http://dx.doi.org/10.7717/peerj-cs.144
author Chard, Kyle
Dart, Eli
Foster, Ian
Shifflett, David
Tuecke, Steven
Williams, Jason
collection PubMed
description We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs. We introduce the design pattern; explain how it leverages high-performance data enclaves and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities. Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
format Online
Article
Text
id pubmed-7924693
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7924693 2021-04-02 PeerJ Comput Sci Computer Networks and Communications PeerJ Inc. 2018-01-15 /pmc/articles/PMC7924693/ /pubmed/33816800 http://dx.doi.org/10.7717/peerj-cs.144 Text en ©2018 Chard et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title The Modern Research Data Portal: a design pattern for networked, data-intensive science
topic Computer Networks and Communications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924693/
https://www.ncbi.nlm.nih.gov/pubmed/33816800
http://dx.doi.org/10.7717/peerj-cs.144