AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance

The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supe...

Bibliographic Details
Main Authors: Huber, Sebastiaan P., Zoupanos, Spyros, Uhrin, Martin, Talirz, Leopold, Kahle, Leonid, Häuselmann, Rico, Gresch, Dominik, Müller, Tiziano, Yakutovich, Aliaksandr V., Andersen, Casper W., Ramirez, Francisco F., Adorf, Carl S., Gargiulo, Fernando, Kumbhar, Snehal, Passaro, Elsa, Johnston, Conrad, Merkys, Andrius, Cepellotti, Andrea, Mounet, Nicolas, Marzari, Nicola, Kozinsky, Boris, Pizzi, Giovanni
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479590/
https://www.ncbi.nlm.nih.gov/pubmed/32901044
http://dx.doi.org/10.1038/s41597-020-00638-4
_version_ 1783580305819435008
author Huber, Sebastiaan P.
Zoupanos, Spyros
Uhrin, Martin
Talirz, Leopold
Kahle, Leonid
Häuselmann, Rico
Gresch, Dominik
Müller, Tiziano
Yakutovich, Aliaksandr V.
Andersen, Casper W.
Ramirez, Francisco F.
Adorf, Carl S.
Gargiulo, Fernando
Kumbhar, Snehal
Passaro, Elsa
Johnston, Conrad
Merkys, Andrius
Cepellotti, Andrea
Mounet, Nicolas
Marzari, Nicola
Kozinsky, Boris
Pizzi, Giovanni
author_sort Huber, Sebastiaan P.
collection PubMed
description The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supercomputers will harden these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands processes/hour, while automatically preserving and storing the full data provenance in a relational database making it queryable and traversable, thus enabling high-performance data analytics. AiiDA’s workflow language provides advanced automation, error handling features and a flexible plugin model to allow interfacing with external simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible.
format Online
Article
Text
id pubmed-7479590
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-74795902020-09-21 AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance Huber, Sebastiaan P. Zoupanos, Spyros Uhrin, Martin Talirz, Leopold Kahle, Leonid Häuselmann, Rico Gresch, Dominik Müller, Tiziano Yakutovich, Aliaksandr V. Andersen, Casper W. Ramirez, Francisco F. Adorf, Carl S. Gargiulo, Fernando Kumbhar, Snehal Passaro, Elsa Johnston, Conrad Merkys, Andrius Cepellotti, Andrea Mounet, Nicolas Marzari, Nicola Kozinsky, Boris Pizzi, Giovanni Sci Data Article The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supercomputers will harden these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands processes/hour, while automatically preserving and storing the full data provenance in a relational database making it queryable and traversable, thus enabling high-performance data analytics. AiiDA’s workflow language provides advanced automation, error handling features and a flexible plugin model to allow interfacing with external simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible. 
Nature Publishing Group UK 2020-09-08 /pmc/articles/PMC7479590/ /pubmed/32901044 http://dx.doi.org/10.1038/s41597-020-00638-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479590/
https://www.ncbi.nlm.nih.gov/pubmed/32901044
http://dx.doi.org/10.1038/s41597-020-00638-4