dtoolAI: Reproducibility for Deep Learning

Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advancements in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs are critically dependent on the data and processing app...

Bibliographic Details
Main Authors: Hartley, Matthew, Olsson, Tjelvar S.G.
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660391/
https://www.ncbi.nlm.nih.gov/pubmed/33205122
http://dx.doi.org/10.1016/j.patter.2020.100073
_version_ 1783608996406493184
author Hartley, Matthew
Olsson, Tjelvar S.G.
author_facet Hartley, Matthew
Olsson, Tjelvar S.G.
author_sort Hartley, Matthew
collection PubMed
description Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advancements in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs are critically dependent on the data and processing approach used to initially generate the model, but this provenance information is usually lost during model training. To avoid a future reproducibility crisis, we need to improve our deep-learning model management. The FAIR principles for data stewardship and software/workflow implementation give excellent high-level guidance on ensuring effective reuse of data and software. We suggest some specific guidelines for the generation and use of deep-learning models in science and explain how these relate to the FAIR principles. We then present dtoolAI, a Python package that we have developed to implement these guidelines. The package implements automatic capture of provenance information during model training and simplifies model distribution.
format Online
Article
Text
id pubmed-7660391
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-76603912020-11-16 dtoolAI: Reproducibility for Deep Learning Hartley, Matthew Olsson, Tjelvar S.G. Patterns (N Y) Article Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advancements in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs are critically dependent on the data and processing approach used to initially generate the model, but this provenance information is usually lost during model training. To avoid a future reproducibility crisis, we need to improve our deep-learning model management. The FAIR principles for data stewardship and software/workflow implementation give excellent high-level guidance on ensuring effective reuse of data and software. We suggest some specific guidelines for the generation and use of deep-learning models in science and explain how these relate to the FAIR principles. We then present dtoolAI, a Python package that we have developed to implement these guidelines. The package implements automatic capture of provenance information during model training and simplifies model distribution. Elsevier 2020-07-23 /pmc/articles/PMC7660391/ /pubmed/33205122 http://dx.doi.org/10.1016/j.patter.2020.100073 Text en © 2020 The Authors http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Hartley, Matthew
Olsson, Tjelvar S.G.
dtoolAI: Reproducibility for Deep Learning
title dtoolAI: Reproducibility for Deep Learning
title_sort dtoolai: reproducibility for deep learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660391/
https://www.ncbi.nlm.nih.gov/pubmed/33205122
http://dx.doi.org/10.1016/j.patter.2020.100073
work_keys_str_mv AT hartleymatthew dtoolaireproducibilityfordeeplearning
AT olssontjelvarsg dtoolaireproducibilityfordeeplearning