
Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientifi...


Bibliographic Details
Main Authors: Koehler Leman, Julia, Lyskov, Sergey, Lewis, Steven M., Adolf-Bryfogle, Jared, Alford, Rebecca F., Barlow, Kyle, Ben-Aharon, Ziv, Farrell, Daniel, Fell, Jason, Hansen, William A., Harmalkar, Ameya, Jeliazkov, Jeliazko, Kuenze, Georg, Krys, Justyna D., Ljubetič, Ajasja, Loshbaugh, Amanda L., Maguire, Jack, Moretti, Rocco, Mulligan, Vikram Khipple, Nance, Morgan L., Nguyen, Phuong T., Ó Conchúir, Shane, Roy Burman, Shourya S., Samanta, Rituparna, Smith, Shannon T., Teets, Frank, Tiemann, Johanna K. S., Watkins, Andrew, Woods, Hope, Yachnin, Brahm J., Bahl, Christopher D., Bailey-Kellogg, Chris, Baker, David, Das, Rhiju, DiMaio, Frank, Khare, Sagar D., Kortemme, Tanja, Labonte, Jason W., Lindorff-Larsen, Kresten, Meiler, Jens, Schief, William, Schueler-Furman, Ora, Siegel, Justin B., Stein, Amelie, Yarov-Yarovoy, Vladimir, Kuhlman, Brian, Leaver-Fay, Andrew, Gront, Dominik, Gray, Jeffrey J., Bonneau, Richard
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8630030/
https://www.ncbi.nlm.nih.gov/pubmed/34845212
http://dx.doi.org/10.1038/s41467-021-27222-7
_version_ 1784607323591802880
author Koehler Leman, Julia
Lyskov, Sergey
Lewis, Steven M.
Adolf-Bryfogle, Jared
Alford, Rebecca F.
Barlow, Kyle
Ben-Aharon, Ziv
Farrell, Daniel
Fell, Jason
Hansen, William A.
Harmalkar, Ameya
Jeliazkov, Jeliazko
Kuenze, Georg
Krys, Justyna D.
Ljubetič, Ajasja
Loshbaugh, Amanda L.
Maguire, Jack
Moretti, Rocco
Mulligan, Vikram Khipple
Nance, Morgan L.
Nguyen, Phuong T.
Ó Conchúir, Shane
Roy Burman, Shourya S.
Samanta, Rituparna
Smith, Shannon T.
Teets, Frank
Tiemann, Johanna K. S.
Watkins, Andrew
Woods, Hope
Yachnin, Brahm J.
Bahl, Christopher D.
Bailey-Kellogg, Chris
Baker, David
Das, Rhiju
DiMaio, Frank
Khare, Sagar D.
Kortemme, Tanja
Labonte, Jason W.
Lindorff-Larsen, Kresten
Meiler, Jens
Schief, William
Schueler-Furman, Ora
Siegel, Justin B.
Stein, Amelie
Yarov-Yarovoy, Vladimir
Kuhlman, Brian
Leaver-Fay, Andrew
Gront, Dominik
Gray, Jeffrey J.
Bonneau, Richard
author_sort Koehler Leman, Julia
collection PubMed
description Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.
format Online
Article
Text
id pubmed-8630030
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8630030 2021-12-01 Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks Koehler Leman, Julia Lyskov, Sergey Lewis, Steven M. Adolf-Bryfogle, Jared Alford, Rebecca F. Barlow, Kyle Ben-Aharon, Ziv Farrell, Daniel Fell, Jason Hansen, William A. Harmalkar, Ameya Jeliazkov, Jeliazko Kuenze, Georg Krys, Justyna D. Ljubetič, Ajasja Loshbaugh, Amanda L. Maguire, Jack Moretti, Rocco Mulligan, Vikram Khipple Nance, Morgan L. Nguyen, Phuong T. Ó Conchúir, Shane Roy Burman, Shourya S. Samanta, Rituparna Smith, Shannon T. Teets, Frank Tiemann, Johanna K. S. Watkins, Andrew Woods, Hope Yachnin, Brahm J. Bahl, Christopher D. Bailey-Kellogg, Chris Baker, David Das, Rhiju DiMaio, Frank Khare, Sagar D. Kortemme, Tanja Labonte, Jason W. Lindorff-Larsen, Kresten Meiler, Jens Schief, William Schueler-Furman, Ora Siegel, Justin B. Stein, Amelie Yarov-Yarovoy, Vladimir Kuhlman, Brian Leaver-Fay, Andrew Gront, Dominik Gray, Jeffrey J. Bonneau, Richard Nat Commun Article Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools.
The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours. Nature Publishing Group UK 2021-11-29 /pmc/articles/PMC8630030/ /pubmed/34845212 http://dx.doi.org/10.1038/s41467-021-27222-7 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8630030/
https://www.ncbi.nlm.nih.gov/pubmed/34845212
http://dx.doi.org/10.1038/s41467-021-27222-7