Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review

Machine learning (ML) and deep learning (DL) have achieved great success in different tasks. These include computer vision, image segmentation, natural language processing, predicting classification, evaluating time series, and predicting values based on a series of variables. As artificial intellig...

Bibliographic Details
Main authors: Gracia Moisés, Ander; Vitoria Pascual, Ignacio; Imas González, José Javier; Ruiz Zamarreño, Carlos
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10610871/
https://www.ncbi.nlm.nih.gov/pubmed/37896655
http://dx.doi.org/10.3390/s23208562
_version_ 1785128358700384256
author Gracia Moisés, Ander
Vitoria Pascual, Ignacio
Imas González, José Javier
Ruiz Zamarreño, Carlos
author_facet Gracia Moisés, Ander
Vitoria Pascual, Ignacio
Imas González, José Javier
Ruiz Zamarreño, Carlos
author_sort Gracia Moisés, Ander
collection PubMed
description Machine learning (ML) and deep learning (DL) have achieved great success in different tasks. These include computer vision, image segmentation, natural language processing, predicting classification, evaluating time series, and predicting values based on a series of variables. As artificial intelligence progresses, new techniques are being applied to areas like optical spectroscopy and its uses in specific fields, such as the agrifood industry. The performance of ML and DL techniques generally improves with the amount of data available. However, it is not always possible to obtain all the necessary data for creating a robust dataset. In the particular case of agrifood applications, dataset collection is generally constrained to specific periods. Weather conditions can also reduce the possibility to cover the entire range of classifications with the consequent generation of imbalanced datasets. To address this issue, data augmentation (DA) techniques are employed to expand the dataset by adding slightly modified copies of existing data. This leads to a dataset that includes values from laboratory tests, as well as a collection of synthetic data based on the real data. This review work will present the application of DA techniques to optical spectroscopy datasets obtained from real agrifood industry applications. The reviewed methods will describe the use of simple DA techniques, such as duplicating samples with slight changes, as well as the utilization of more complex algorithms based on deep learning generative adversarial networks (GANs), and semi-supervised generative adversarial networks (SGANs).
format Online
Article
Text
id pubmed-10610871
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10610871 2023-10-28 Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review Gracia Moisés, Ander Vitoria Pascual, Ignacio Imas González, José Javier Ruiz Zamarreño, Carlos Sensors (Basel) Review Machine learning (ML) and deep learning (DL) have achieved great success in different tasks. These include computer vision, image segmentation, natural language processing, predicting classification, evaluating time series, and predicting values based on a series of variables. As artificial intelligence progresses, new techniques are being applied to areas like optical spectroscopy and its uses in specific fields, such as the agrifood industry. The performance of ML and DL techniques generally improves with the amount of data available. However, it is not always possible to obtain all the necessary data for creating a robust dataset. In the particular case of agrifood applications, dataset collection is generally constrained to specific periods. Weather conditions can also reduce the possibility to cover the entire range of classifications with the consequent generation of imbalanced datasets. To address this issue, data augmentation (DA) techniques are employed to expand the dataset by adding slightly modified copies of existing data. This leads to a dataset that includes values from laboratory tests, as well as a collection of synthetic data based on the real data. This review work will present the application of DA techniques to optical spectroscopy datasets obtained from real agrifood industry applications. The reviewed methods will describe the use of simple DA techniques, such as duplicating samples with slight changes, as well as the utilization of more complex algorithms based on deep learning generative adversarial networks (GANs), and semi-supervised generative adversarial networks (SGANs).
MDPI 2023-10-18 /pmc/articles/PMC10610871/ /pubmed/37896655 http://dx.doi.org/10.3390/s23208562 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Review
Gracia Moisés, Ander
Vitoria Pascual, Ignacio
Imas González, José Javier
Ruiz Zamarreño, Carlos
Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title_full Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title_fullStr Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title_full_unstemmed Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title_short Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review
title_sort data augmentation techniques for machine learning applied to optical spectroscopy datasets in agrifood applications: a comprehensive review
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10610871/
https://www.ncbi.nlm.nih.gov/pubmed/37896655
http://dx.doi.org/10.3390/s23208562
work_keys_str_mv AT graciamoisesander dataaugmentationtechniquesformachinelearningappliedtoopticalspectroscopydatasetsinagrifoodapplicationsacomprehensivereview
AT vitoriapascualignacio dataaugmentationtechniquesformachinelearningappliedtoopticalspectroscopydatasetsinagrifoodapplicationsacomprehensivereview
AT imasgonzalezjosejavier dataaugmentationtechniquesformachinelearningappliedtoopticalspectroscopydatasetsinagrifoodapplicationsacomprehensivereview
AT ruizzamarrenocarlos dataaugmentationtechniquesformachinelearningappliedtoopticalspectroscopydatasetsinagrifoodapplicationsacomprehensivereview
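
Illustrative note: the abstract describes simple DA as expanding a dataset with slightly modified copies of existing samples. The following is a minimal sketch of that idea for spectra stored as rows of a NumPy array. The function name `augment_spectra`, the three perturbations (additive noise, baseline offset, multiplicative scaling) and all default magnitudes are assumptions chosen for illustration; they are not taken from the reviewed article.

```python
import numpy as np


def augment_spectra(X, y, n_copies=2, noise_std=0.01,
                    offset_range=0.02, scale_range=0.05, seed=0):
    """Return the original spectra plus n_copies perturbed copies of each.

    X: array of shape (n_samples, n_wavelengths); y: labels of length n_samples.
    All perturbation magnitudes are hypothetical defaults, not recommended values.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples = X.shape[0]

    X_parts, y_parts = [X], [y]
    for _ in range(n_copies):
        # Independent Gaussian noise at every wavelength.
        noise = rng.normal(0.0, noise_std, size=X.shape)
        # One random baseline shift per spectrum.
        offset = rng.uniform(-offset_range, offset_range, size=(n_samples, 1))
        # One random intensity scaling factor per spectrum.
        scale = 1.0 + rng.uniform(-scale_range, scale_range, size=(n_samples, 1))
        X_parts.append(scale * X + offset + noise)
        # Each synthetic spectrum keeps the label of the spectrum it was derived from.
        y_parts.append(y)

    return np.concatenate(X_parts), np.concatenate(y_parts)


# Usage: three flat dummy spectra with five wavelengths each.
X_demo = np.ones((3, 5))
y_demo = np.array([0, 1, 1])
X_aug, y_aug = augment_spectra(X_demo, y_demo, n_copies=2)
print(X_aug.shape, y_aug.shape)
```

The original samples are kept at the front of the returned arrays, so real and synthetic data can still be told apart by index. Under this sketch an imbalanced dataset could be evened out by calling the function only on the minority-class rows.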