
Assessment of clustering techniques to support the analyses of soybean seed vigor


Bibliographic Details
Main authors: de Oliveira, Eduardo R., Bugatti, Pedro H., Saito, Priscila T. M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10456222/
https://www.ncbi.nlm.nih.gov/pubmed/37624819
http://dx.doi.org/10.1371/journal.pone.0285566
Description
Summary: Soy is the main product of Brazilian agriculture and the fourth most cultivated bean globally. Because soy cultivation is expected to keep growing and the market is large, guaranteeing product quality is indispensable for enterprises to stay competitive. Industries perform vigor tests to gather information and evaluate seed quality for soy planting. The tetrazolium test, for example, provides information about moisture damage, stink bug damage, and mechanical damage. However, an analyst must determine the cause and severity of the damage seed by seed. Since this work is massive and exhausting, it is susceptible to mistakes. Approaches based on supervised learning, including active learning strategies, have already been applied with significant results. This paper instead analyzes the performance of unsupervised techniques for classifying soybeans. An extensive experimental evaluation was performed, considering nine clustering algorithms (partitional, hierarchical, and density-based) applied to five image datasets of soybean seeds submitted to the tetrazolium test, covering different types of damage and/or damage levels. To describe those images, we considered 18 traditional feature extractors. For validation, we also considered four metrics (accuracy, Fowlkes-Mallows, Davies-Bouldin, and Calinski-Harabasz) and two dimensionality reduction techniques (principal component analysis and t-distributed stochastic neighbor embedding). The results make it possible to identify descriptors and clustering algorithms that can be used as preprocessing in other learning processes, accelerating and improving the classification process for key agricultural problems.
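The kind of evaluation the summary describes can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the synthetic blob data standing in for image features, the three algorithms chosen (one partitional, one hierarchical, one density-based), and all parameter values are assumptions made for illustration only. Accuracy and t-SNE are omitted to keep the example short.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    fowlkes_mallows_score,
)
from sklearn.preprocessing import StandardScaler


def evaluate_clusterings(features, labels, n_clusters, seed=0):
    """Cluster PCA-reduced features and score each result (hypothetical helper)."""
    # Standardize, then project to 2 components with PCA.
    scaled = StandardScaler().fit_transform(features)
    reduced = PCA(n_components=2, random_state=seed).fit_transform(scaled)

    # One representative per family; eps and min_samples are arbitrary guesses.
    algorithms = {
        "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "dbscan": DBSCAN(eps=0.3, min_samples=5),
    }

    results = {}
    for name, algorithm in algorithms.items():
        predicted = algorithm.fit_predict(reduced)
        # External index: compares the clustering against the known labels.
        scores = {"fowlkes_mallows": fowlkes_mallows_score(labels, predicted)}
        # Internal indices are only defined for at least two distinct labels.
        # DBSCAN noise points (label -1) are treated here as one more group.
        if len(set(predicted)) > 1:
            scores["davies_bouldin"] = davies_bouldin_score(reduced, predicted)
            scores["calinski_harabasz"] = calinski_harabasz_score(reduced, predicted)
        results[name] = scores
    return results


# Synthetic stand-in for extracted image features (not real seed data).
features, labels = make_blobs(
    n_samples=300, centers=3, n_features=16, cluster_std=1.0, random_state=0
)
results = evaluate_clusterings(features, labels, n_clusters=3)
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Higher Fowlkes-Mallows and Calinski-Harabasz values and lower Davies-Bouldin values indicate better clusterings, so the three scores have to be read with their direction in mind when comparing algorithms.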