Comparative Study of Several Machine Learning Algorithms for Classification of Unifloral Honeys

Bibliographic Details
Main authors: Mateo, Fernando; Tarazona, Andrea; Mateo, Eva María
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8303996/
https://www.ncbi.nlm.nih.gov/pubmed/34359412
http://dx.doi.org/10.3390/foods10071543
author Mateo, Fernando
Tarazona, Andrea
Mateo, Eva María
collection PubMed
description Unifloral honeys are in high demand among consumers, especially in Europe. The classical way to verify that a honey belongs to a highly valued botanical class is palynological analysis, which identifies and counts pollen grains. This task requires highly trained personnel, which complicates the characterization of honey botanical origins. Organoleptic assessment by expert personnel helps to confirm the classification. This study compared the ability of different machine learning (ML) algorithms to correctly classify seven types of Spanish honeys of single botanical origin (rosemary, citrus, lavender, sunflower, eucalyptus, heather and forest honeydew). The botanical origin of the samples was ascertained by pollen analysis complemented with organoleptic assessment. The dataset was built from physicochemical parameters of the unifloral honeys, such as electrical conductivity, pH, water content, carbohydrates and color. The following ML algorithms were tested: penalized discriminant analysis (PDA), shrinkage discriminant analysis (SDA), high-dimensional discriminant analysis (HDDA), nearest shrunken centroids (PAM), partial least squares (PLS), C5.0 tree, extremely randomized trees (ET), weighted k-nearest neighbors (KKNN), artificial neural networks (ANN), random forest (RF), support vector machine (SVM) with linear and radial kernels and extreme gradient boosting trees (XGBoost). The ML models were optimized by repeated 10-fold cross-validation, primarily on the basis of log loss or accuracy, and their performance was compared on a test set to select the best predictive model. Models built using PDA produced the best overall accuracy on the test set. ANN, ET, RF and XGBoost models also provided good results, while SVM performed worst.
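The tuning procedure named in the description (repeated 10-fold cross-validation scored by log loss or accuracy) can be illustrated with a minimal sketch. It is not the authors' code, and it uses none of the thirteen algorithms listed: the classifier is a stand-in nearest-centroid model with a softmax over distances, `temperature` is a hypothetical hyperparameter, and the two-feature data set is synthetic. The held-out test-set comparison is omitted; only the cross-validated selection step is shown.

```python
import math
import random
from collections import defaultdict


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, p in zip(y_true, probs):
        total -= math.log(min(max(p.get(label, 0.0), eps), 1.0))
    return total / len(y_true)


def accuracy(y_true, probs):
    """Fraction of samples whose most probable class is the true class."""
    hits = sum(max(p, key=p.get) == label for label, p in zip(y_true, probs))
    return hits / len(y_true)


def repeated_stratified_folds(labels, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps the class mix.
            for pos, i in enumerate(shuffled):
                folds[pos % k].append(i)
        for f in range(k):
            held_out = set(folds[f])
            train = [i for i in range(len(labels)) if i not in held_out]
            yield train, folds[f]


def fit_centroids(X, y):
    """Stand-in model (assumption): one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_proba(centroids, row, temperature):
    """Softmax over negative squared distances to each class centroid."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(row, mu)) / temperature
              for c, mu in centroids.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def cv_log_loss(X, y, temperature, k=10, repeats=3, seed=0):
    """Average log loss over all folds of all repeats."""
    losses = []
    for train, test in repeated_stratified_folds(y, k, repeats, seed):
        cents = fit_centroids([X[i] for i in train], [y[i] for i in train])
        probs = [predict_proba(cents, X[i], temperature) for i in test]
        losses.append(log_loss([y[i] for i in test], probs))
    return sum(losses) / len(losses)


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic two-feature samples; the class centers are made up."""
    rng = random.Random(seed)
    centers = {"rosemary": (0.2, 3.8), "heather": (0.7, 4.3), "honeydew": (1.1, 4.8)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)])
            y.append(label)
    return X, y


X, y = make_toy_data()
grid = [0.005, 0.02, 0.1, 0.5]           # hypothetical tuning grid
scores = {t: cv_log_loss(X, y, t) for t in grid}
best_temperature = min(scores, key=scores.get)
```

Log loss rewards well-calibrated class probabilities, so it can rank two candidate settings differently even when their accuracy is identical; that is one plausible reason to prefer it as the primary tuning metric.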
format Online
Article
Text
id pubmed-8303996
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8303996 2021-07-25 Comparative Study of Several Machine Learning Algorithms for Classification of Unifloral Honeys. Mateo, Fernando; Tarazona, Andrea; Mateo, Eva María. Foods. Article. MDPI 2021-07-03 /pmc/articles/PMC8303996/ /pubmed/34359412 http://dx.doi.org/10.3390/foods10071543 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
topic Article