Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics
Analysis of landmark-based morphometric measurements taken on body parts of insects has been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics has largely been dominated by traditional methods and approaches such as principal compon...
Main authors: | Salifu, Daisy; Ibrahim, Eric Ali; Tonnang, Henri E. Z. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9065030/ https://www.ncbi.nlm.nih.gov/pubmed/35505067 http://dx.doi.org/10.1038/s41598-022-11258-w |
_version_ | 1784699495346339840 |
---|---|
author | Salifu, Daisy; Ibrahim, Eric Ali; Tonnang, Henri E. Z. |
author_facet | Salifu, Daisy; Ibrahim, Eric Ali; Tonnang, Henri E. Z. |
author_sort | Salifu, Daisy |
collection | PubMed |
description | Analysis of landmark-based morphometric measurements taken on body parts of insects has been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics has largely been dominated by traditional methods and approaches such as principal component analysis (PCA), canonical variate analysis (CVA) and discriminant analysis (DA). However, advances in computing power have created a paradigm shift towards modern tools such as machine learning. Herein, we assess the predictive performance of four machine learning classifiers: K-nearest neighbor (KNN), random forest (RF), support vector machine (the linear, polynomial and radial kernel SVMs) and artificial neural networks (ANNs), on fruit fly morphometrics that were previously analysed using PCA and CVA. KNN and RF performed poorly, with overall model accuracy lower than the "no-information rate" (NIR) (p value > 0.1). The SVM models had a predictive accuracy of > 95%, significantly higher than NIR (p < 0.001), Kappa > 0.78 and an area under the curve (AUC) of the receiver operating characteristic of > 0.91, while the ANN model had a predictive accuracy of 96%, significantly higher than NIR, a Kappa of 0.83 and an AUC of 0.98. Wing veins 2, 3, 8, 10, 14 and tibia length were of higher importance than other variables based on both SVM and ANN models. We conclude that SVM and ANN models could be used to discriminate fruit fly species, or any other morphologically similar pest taxa, based on wing vein and tibia length measurements. These algorithms could be used as candidates for developing integrated and smart application software for insect discrimination and identification. The variable importance results in this study would be useful to future studies in deciding what must be measured. |
format | Online Article Text |
id | pubmed-9065030 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9065030 2022-05-04 Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics Salifu, Daisy; Ibrahim, Eric Ali; Tonnang, Henri E. Z. Sci Rep Article Nature Publishing Group UK 2022-05-03 /pmc/articles/PMC9065030/ /pubmed/35505067 http://dx.doi.org/10.1038/s41598-022-11258-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Salifu, Daisy Ibrahim, Eric Ali Tonnang, Henri E. Z. Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics |
title | Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics |
title_sort | leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9065030/ https://www.ncbi.nlm.nih.gov/pubmed/35505067 http://dx.doi.org/10.1038/s41598-022-11258-w |
work_keys_str_mv | AT salifudaisy leveragingmachinelearningtoolsandalgorithmsforanalysisoffruitflymorphometrics AT ibrahimericali leveragingmachinelearningtoolsandalgorithmsforanalysisoffruitflymorphometrics AT tonnanghenriez leveragingmachinelearningtoolsandalgorithmsforanalysisoffruitflymorphometrics |
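
The description above reports classifier performance as overall accuracy compared with the "no-information rate" (NIR), Cohen's Kappa, and a p value for accuracy exceeding NIR. The following is a minimal Python sketch of how such figures can be derived from a confusion matrix. It is not the authors' code: the function names are hypothetical, the one-sided exact binomial test is one common way to compare accuracy with NIR and is assumed here rather than taken from the record, and the example matrix is made-up toy data, not results from the study.

```python
from math import comb


def _totals(cm):
    """Return (n, correct) for a square confusion matrix (rows = true class)."""
    n = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return n, correct


def accuracy(cm):
    """Share of observations on the diagonal of the confusion matrix."""
    n, correct = _totals(cm)
    return correct / n


def no_information_rate(cm):
    """Accuracy obtained by always predicting the most frequent true class."""
    n, _ = _totals(cm)
    return max(sum(row) for row in cm) / n


def cohen_kappa(cm):
    """Agreement between predicted and true labels, corrected for chance."""
    n, _ = _totals(cm)
    k = len(cm)
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_observed = accuracy(cm)
    p_expected = sum(r * c for r, c in zip(row_tot, col_tot)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


def p_value_accuracy_above_nir(cm):
    """One-sided exact binomial p value for H1: accuracy > NIR (assumed test)."""
    n, correct = _totals(cm)
    p0 = no_information_rate(cm)
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(correct, n + 1))


if __name__ == "__main__":
    # Toy two-class confusion matrix (hypothetical counts, for illustration only).
    toy = [[40, 2],
           [3, 5]]
    print("accuracy:", accuracy(toy))
    print("NIR:     ", no_information_rate(toy))
    print("kappa:   ", cohen_kappa(toy))
    print("p value: ", p_value_accuracy_above_nir(toy))
```

With imbalanced classes the NIR can itself be high, so a model is only informative when its accuracy is significantly above the NIR; Kappa gives a complementary, chance-corrected view of the same confusion matrix.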