Machine Learning Approaches Applied to GC-FID Fatty Acid Profiles to Discriminate Wild from Farmed Salmon

Bibliographic Details
Main authors: Grazina, Liliana; Rodrigues, P. J.; Igrejas, Getúlio; Nunes, Maria A.; Mafra, Isabel; Arlorio, Marco; Oliveira, M. Beatriz P. P.; Amaral, Joana S.
Format: Online Article Text
Language: English
Published: MDPI 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7695029/
https://www.ncbi.nlm.nih.gov/pubmed/33171721
http://dx.doi.org/10.3390/foods9111622
collection PubMed
description In the last decade, demand has increased for wild-captured fish, which commands higher prices than farmed species and is therefore prone to mislabeling. In this work, fatty acid composition coupled with advanced chemometrics was used to discriminate wild from farmed salmon. The lipids extracted from salmon muscle of different production methods and origins (26 wild from Canada, 25 farmed from Canada, 24 farmed from Chile and 25 farmed from Norway) were analyzed by gas chromatography with flame ionization detection (GC-FID). All the tested chemometric approaches, namely principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE) and seven machine learning classifiers (k-nearest neighbors (kNN), decision tree, support vector machine (SVM), random forest, artificial neural network (ANN), naïve Bayes and AdaBoost), differentiated farmed from wild salmon using the 17 features obtained from chemical analysis. PCA did not clearly separate the samples by geographical origin, since farmed samples from Canada and Chile overlapped. Nevertheless, using the 17 features in the models, six of the seven tested machine learning classifiers reached a classification accuracy of ≥99%, with ANN, naïve Bayes, random forest, SVM and kNN presenting 100% accuracy on the test dataset. The classification models were also assayed using only the best features selected by a reduction algorithm and the best input features mapped by t-SNE. The kNN classifier provided the best discrimination results because it correctly classified all samples according to production method and origin, ultimately using only the three most important features (16:0, 18:2n6c and 20:3n3 + 20:4n6). In general, the classifiers generalized well, and the proposed approach is simple and has the advantage of requiring only common equipment available in most labs.
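The kNN step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature values below are invented for illustration, and the Euclidean distance, k = 3, the leave-one-out check and the names TRAIN, knn_predict and leave_one_out_accuracy are all assumptions. Only the three feature names and the four sample groups come from the record.

```python
import math
from collections import Counter

# Hypothetical samples: (16:0, 18:2n6c, 20:3n3 + 20:4n6) as % of total fatty acids.
# The numbers are invented for illustration and are NOT taken from the study.
TRAIN = [
    ((14.0, 1.5, 0.9), "wild-Canada"),
    ((14.5, 1.8, 1.0), "wild-Canada"),
    ((13.6, 1.4, 0.8), "wild-Canada"),
    ((15.0, 10.0, 0.5), "farmed-Canada"),
    ((15.3, 10.4, 0.5), "farmed-Canada"),
    ((14.8, 9.7, 0.4), "farmed-Canada"),
    ((16.0, 8.0, 0.6), "farmed-Chile"),
    ((16.4, 8.5, 0.7), "farmed-Chile"),
    ((15.7, 7.6, 0.6), "farmed-Chile"),
    ((11.0, 12.0, 0.4), "farmed-Norway"),
    ((10.6, 12.6, 0.5), "farmed-Norway"),
    ((11.3, 11.5, 0.3), "farmed-Norway"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance in the three-feature space (an assumed metric).
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(train, k=3):
    """Classify each sample using all other samples and return the hit rate."""
    hits = 0
    for i, (features, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(train)


if __name__ == "__main__":
    print(knn_predict(TRAIN, (14.2, 1.6, 0.9)))
    print(leave_one_out_accuracy(TRAIN))
```

With real GC-FID profiles the features would normally be scaled before computing distances, and accuracy would be estimated on a held-out test set as the description reports.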
id pubmed-7695029
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7695029 2020-11-28 Foods Article MDPI 2020-11-07 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).