
High frequency accuracy and loss data of random neural networks trained on image datasets

Bibliographic Details
Main Authors: Rorabaugh, Ariel Keller, Caíno-Lores, Silvina, Johnston, Travis, Taufer, Michela
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8749157/
https://www.ncbi.nlm.nih.gov/pubmed/35036484
http://dx.doi.org/10.1016/j.dib.2021.107780
author Rorabaugh, Ariel Keller
Caíno-Lores, Silvina
Johnston, Travis
Taufer, Michela
collection PubMed
description Neural Networks (NNs) are increasingly used across scientific domains to extract knowledge from experimental or computational data. An NN is composed of natural or artificial neurons that serve as simple processing units and are interconnected into a model architecture; it acquires knowledge from the environment through a learning process and stores this knowledge in its connections. The learning process is conducted by training. During NN training, the learning process can be tracked by periodically validating the NN and calculating its fitness. The resulting sequence of fitness values (i.e., validation accuracy or validation loss) is called the NN learning curve. The development of tools for NN design requires knowledge of diverse NNs and their complete learning curves. Generally, only final fully-trained fitness values for highly accurate NNs are made available to the community, hampering efforts to develop tools for NN design and leaving unaddressed aspects such as explaining the generation of an NN and reproducing its learning process. Our dataset fills this gap by fully recording the structure, metadata, and complete learning curves for a wide variety of random NNs throughout their training. Our dataset captures the lifespan of 6000 NNs throughout generation, training, and validation stages. It consists of a suite of 6000 tables, each table representing the lifespan of one NN. We generate each NN with randomized parameter values and train it for 40 epochs on one of three diverse image datasets (i.e., CIFAR-100, FashionMNIST, SVHN). We calculate and record each NN’s fitness with high frequency—every half epoch—to capture the evolution of the training and validation process. As a result, for each NN, we record the generated parameter values describing the structure of that NN, the image dataset on which the NN trained, and all loss and accuracy values for the NN every half epoch. 
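The half-epoch recording scheme described above can be sketched as a plain training loop (a minimal illustration, not the authors' code; the model update and validation step here are stand-ins):

```python
# Sketch: tracking a learning curve by validating every half epoch.
# The "training step" and "fitness" below are stand-ins, not a real NN.
def train_with_half_epoch_validation(n_epochs=40, batches_per_epoch=10):
    validate_every = batches_per_epoch // 2   # half an epoch, in batches
    learning_curve = []                       # (epoch, fitness) checkpoints
    fitness = 0.0
    for epoch in range(n_epochs):
        for batch in range(batches_per_epoch):
            fitness += 0.001                  # stand-in for one training step
            if (batch + 1) % validate_every == 0:
                # Stand-in for running validation and recording fitness
                point = (epoch + (batch + 1) / batches_per_epoch, fitness)
                learning_curve.append(point)
    return learning_curve

curve = train_with_half_epoch_validation()
print(len(curve))   # 80 checkpoints: 2 per epoch for 40 epochs
```

Validating twice per epoch for 40 epochs yields the 80 fitness values that make up one NN's learning curve in the dataset.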
We put our dataset to the service of researchers studying NN performance and its evolution throughout training and validation. Statistical methods can be applied to our dataset to analyze the shape of learning curves in diverse NNs, and the relationship between an NN’s structure and its fitness. Additionally, the structural data and metadata that we record enable the reconstruction and reproducibility of the associated NN.
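As a sketch of how one of the 6000 per-NN tables could be consumed for learning-curve analysis (column names and values here are hypothetical stand-ins, not the dataset's actual schema), each row holds one half-epoch checkpoint, so 40 epochs yield 80 rows per NN:

```python
# Sketch: analyzing one NN's learning curve recorded every half epoch.
# Column names and values are illustrative, not the dataset's real schema.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
half_epochs = np.arange(0.5, 40.5, 0.5)      # 80 checkpoints over 40 epochs
curve = pd.DataFrame({
    "epoch": half_epochs,
    "val_loss": np.exp(-0.1 * half_epochs) + rng.normal(0, 0.01, half_epochs.size),
    "val_accuracy": 1 - 0.9 * np.exp(-0.1 * half_epochs),
})

# Locate the checkpoint with the best validation accuracy
best = curve.loc[curve["val_accuracy"].idxmax()]
print(f"checkpoints recorded: {len(curve)}")  # 80
print(f"best val_accuracy at epoch {best.epoch}")
```

The same pattern extends to curve-shape statistics across all 6000 tables, e.g. fitting parametric models to each `val_accuracy` column and correlating the fitted parameters with the recorded structural metadata.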
format Online
Article
Text
id pubmed-8749157
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8749157 2022-01-13 High frequency accuracy and loss data of random neural networks trained on image datasets Rorabaugh, Ariel Keller Caíno-Lores, Silvina Johnston, Travis Taufer, Michela Data Brief Data Article Elsevier 2022-01-05 /pmc/articles/PMC8749157/ /pubmed/35036484 http://dx.doi.org/10.1016/j.dib.2021.107780 Text en © 2022 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title High frequency accuracy and loss data of random neural networks trained on image datasets
topic Data Article