Generative adversarial network based synthetic data training model for lightweight convolutional neural networks

Inadequate training data is a significant challenge for deep learning techniques, particularly in applications where data is difficult to get, and publicly available datasets are uncommon owing to ethical and privacy concerns. Various approaches, such as data augmentation and transfer learning, are...

Bibliographic Details
Main Authors:	Rather, Ishfaq Hussain; Kumar, Sushil
Format:	Online Article Text
Language:	English
Published:	Springer US 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10199442/ https://www.ncbi.nlm.nih.gov/pubmed/37362646 http://dx.doi.org/10.1007/s11042-023-15747-6

author	Rather, Ishfaq Hussain Kumar, Sushil
author_facet	Rather, Ishfaq Hussain Kumar, Sushil
author_sort	Rather, Ishfaq Hussain
collection	PubMed
description	Inadequate training data is a significant challenge for deep learning techniques, particularly in applications where data is difficult to obtain and publicly available datasets are uncommon owing to ethical and privacy concerns. Various approaches, such as data augmentation and transfer learning, are employed to address this problem, and they help to some extent. However, after a certain amount of data augmentation, the quality of the generated data stalls, and transfer learning suffers from negative transfer. This paper proposes a novel generative adversarial network-based synthetic data training (GAN-ST) model to generate synthetic data for training a lightweight convolutional neural network (CNN). An enhanced generator is proposed to quickly saturate and cover the colour space of the training distribution. The GAN-ST model is based on the Deep Convolutional Generative Adversarial Network (DCGAN) and Conditional Generative Adversarial Network (CGAN) models, which include an enhanced generator. The study evaluates the accuracy of a CNN model on the MNIST and CIFAR-10 datasets using both original and synthetic data. On MNIST, the classifier achieves an accuracy of 99.38% when trained on GAN-ST-generated synthetic data, only 0.05% lower than when trained on the original data. On CIFAR-10, the classifier achieves an accuracy of 90.23%. Compared with training on synthetic data from a single GAN, training on GAN-ST-based synthetic data improves CNN accuracy by up to 0.66% on MNIST and 7.06% on CIFAR-10. By training two GANs independently, the GAN-ST model covers different parts of the original data distribution, resulting in a more diverse and realistic training data set for the classifier. A CNN trained on this diverse set of synthetic data generalizes better to new data, leading to improved classification accuracy.
format	Online Article Text
id	pubmed-10199442
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-10199442 2023-05-23 Multimed Tools Appl Article Springer US 2023-05-20 /pmc/articles/PMC10199442/ /pubmed/37362646 http://dx.doi.org/10.1007/s11042-023-15747-6 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Generative adversarial network based synthetic data training model for lightweight convolutional neural networks
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10199442/ https://www.ncbi.nlm.nih.gov/pubmed/37362646 http://dx.doi.org/10.1007/s11042-023-15747-6

Similar Items