Distribution-preserving data augmentation

In the last decade, deep learning has been applied in a wide range of problems with tremendous success. This success mainly comes from large data availability, increased computational power, and theoretical improvements in the training phase. As the dataset grows, the real world is better represente...

Bibliographic Details
Main authors:	Saran, Nurdan Ayse; Saran, Murat; Nar, Fatih
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2021
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176531/ https://www.ncbi.nlm.nih.gov/pubmed/34141893 http://dx.doi.org/10.7717/peerj-cs.571

author	Saran, Nurdan Ayse; Saran, Murat; Nar, Fatih
collection	PubMed
description	In the last decade, deep learning has been applied to a wide range of problems with tremendous success. This success mainly comes from large data availability, increased computational power, and theoretical improvements in the training phase. As a dataset grows, it represents the real world better, making it possible to develop a model that can generalize. However, creating a labeled dataset is expensive and time-consuming, and in some domains it is challenging, if not infeasible. Therefore, researchers have proposed data augmentation methods that increase dataset size and variety by creating variations of the existing data. For image data, variations can be obtained by applying color transformations, spatial transformations, or a combination of both. Color transformations apply linear or nonlinear operations to the entire image or to patches of it to create variations of the original image. Current color-based augmentation methods are usually based on image processing operations such as equalizing, solarizing, and posterizing. Nevertheless, these color-based data augmentation methods do not guarantee plausible variations of the image. This paper proposes a novel distribution-preserving data augmentation method that creates plausible image variations by shifting pixel colors to another point in the image color distribution. The authors achieve this by defining a regularized density-decreasing direction that creates paths from the original pixels' colors to the distribution tails. The proposed method outperforms existing data augmentation methods, as shown in a transfer learning scenario on the UC Merced Land-use, Intel Image Classification, and Oxford-IIIT Pet datasets for classification and segmentation tasks.
format	Online Article Text
id	pubmed-8176531
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-8176531 2021-06-16 Distribution-preserving data augmentation Saran, Nurdan Ayse; Saran, Murat; Nar, Fatih PeerJ Comput Sci Artificial Intelligence PeerJ Inc. 2021-05-27 /pmc/articles/PMC8176531/ /pubmed/34141893 http://dx.doi.org/10.7717/peerj-cs.571 Text en © 2021 Saran et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	Distribution-preserving data augmentation
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176531/ https://www.ncbi.nlm.nih.gov/pubmed/34141893 http://dx.doi.org/10.7717/peerj-cs.571
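
Illustrative note on the description above: the abstract names solarizing and posterizing as typical color-based augmentations. The following is a minimal NumPy sketch of those two baseline operations only, not of the method proposed in the article. The function names `solarize` and `posterize`, the parameters `threshold` and `bits`, and the 8-bit RGB input are assumptions made for illustration.

```python
import numpy as np


def solarize(image: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Invert every 8-bit channel value that is at or above `threshold`."""
    return np.where(image >= threshold, 255 - image, image).astype(np.uint8)


def posterize(image: np.ndarray, bits: int = 4) -> np.ndarray:
    """Keep only the `bits` most significant bits of each 8-bit channel value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    # Build a mask such as 0b11110000 for bits=4.
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return (image & np.uint8(mask)).astype(np.uint8)


# A single hypothetical RGB pixel, used only to show the remapping.
pixel = np.array([[[0, 100, 200]]], dtype=np.uint8)
print(solarize(pixel))      # channel values become 0, 100, 55
print(posterize(pixel, 2))  # channel values become 0, 64, 192
```

Both functions remap each channel value by a fixed rule that never consults the color distribution of the image, which is consistent with the abstract's remark that such transformations do not guarantee plausible variations.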
