AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance
The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supe...
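Main authors: | Huber, Sebastiaan P.; Zoupanos, Spyros; Uhrin, Martin; Talirz, Leopold; Kahle, Leonid; Häuselmann, Rico; Gresch, Dominik; Müller, Tiziano; Yakutovich, Aliaksandr V.; Andersen, Casper W.; Ramirez, Francisco F.; Adorf, Carl S.; Gargiulo, Fernando; Kumbhar, Snehal; Passaro, Elsa; Johnston, Conrad; Merkys, Andrius; Cepellotti, Andrea; Mounet, Nicolas; Marzari, Nicola; Kozinsky, Boris; Pizzi, Giovanni |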
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479590/ https://www.ncbi.nlm.nih.gov/pubmed/32901044 http://dx.doi.org/10.1038/s41597-020-00638-4 |
_version_ | 1783580305819435008 |
---|---|
author | Huber, Sebastiaan P. Zoupanos, Spyros Uhrin, Martin Talirz, Leopold Kahle, Leonid Häuselmann, Rico Gresch, Dominik Müller, Tiziano Yakutovich, Aliaksandr V. Andersen, Casper W. Ramirez, Francisco F. Adorf, Carl S. Gargiulo, Fernando Kumbhar, Snehal Passaro, Elsa Johnston, Conrad Merkys, Andrius Cepellotti, Andrea Mounet, Nicolas Marzari, Nicola Kozinsky, Boris Pizzi, Giovanni |
author_sort | Huber, Sebastiaan P. |
collection | PubMed |
description | The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supercomputers will harden these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands processes/hour, while automatically preserving and storing the full data provenance in a relational database making it queryable and traversable, thus enabling high-performance data analytics. AiiDA’s workflow language provides advanced automation, error handling features and a flexible plugin model to allow interfacing with external simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible. |
format | Online Article Text |
id | pubmed-7479590 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7479590 2020-09-21 Sci Data Article Nature Publishing Group UK 2020-09-08 /pmc/articles/PMC7479590/ /pubmed/32901044 http://dx.doi.org/10.1038/s41597-020-00638-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479590/ https://www.ncbi.nlm.nih.gov/pubmed/32901044 http://dx.doi.org/10.1038/s41597-020-00638-4 |