Morphological Traits Evaluated with Random Forest Method Explains Natural Classification of Grapevine (Vitis vinifera L.) Cultivars
There are hundreds of morphologic and morphometric traits available to classify and identify grapevine (Vitis vinifera L.) genotypes, while statistical evaluation of those has certain limitations, especially when we have no information about the traits that are discriminative to a certain sample set...
Main authors: | Szűgyi-Reiczigel, Zsófia; Ladányi, Márta; Bisztray, György Dénes; Varga, Zsuzsanna; Bodor-Pesti, Péter |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781146/ https://www.ncbi.nlm.nih.gov/pubmed/36559539 http://dx.doi.org/10.3390/plants11243428 |
_version_ | 1784857002390847488 |
---|---|
author | Szűgyi-Reiczigel, Zsófia; Ladányi, Márta; Bisztray, György Dénes; Varga, Zsuzsanna; Bodor-Pesti, Péter |
author_facet | Szűgyi-Reiczigel, Zsófia; Ladányi, Márta; Bisztray, György Dénes; Varga, Zsuzsanna; Bodor-Pesti, Péter |
author_sort | Szűgyi-Reiczigel, Zsófia |
collection | PubMed |
description | There are hundreds of morphologic and morphometric traits available to classify and identify grapevine (Vitis vinifera L.) genotypes, while statistical evaluation of those has certain limitations, especially when we have no information about the traits that are discriminative to a certain sample set. High numbers of investigated characters could cause redundancy, while reducing those numbers may result in data loss. Grapevine is one of the most important horticultural crops, with many cultivars in production. The characterization of the genotypes is of undeniably high importance. In this study, we analyzed a dataset of scientific and historical importance with 125 morphological traits of 97 grapevine cultivars described by Németh in 1966. However, the traits are not independent in a set of a large number of categorical traits with too few cultivars. Therefore, the number of traits was first reduced using a simple and effective algorithm to eliminate traits with redundant information content using the asymmetric measure of association Goodman and Kruskal’s λ. We reduced the number of traits from 125 to 59 without any information loss. For the classification, we applied a random forest (RF) method. In this way, 93% of the cultivars were correctly classified using only four traits of the data set. To our knowledge, only a few studies applied a trait elimination algorithm similar to ours in ampelography that can be used for other biological data sets of similar structure. The classification results give a morphological explanation to several cultivars from the Carpathian Basin, a territory where all three Vitis vinifera L. geographical groups, occidentalis, orientalis and pontica, are represented. We found that the information-loss-avoiding data reduction method we applied in our study solved the redundancy-caused interdependencies and provided a suitable dataset for classifying grapevine genotypes. For example, this method may successfully be applied in digital image analysis-based traditional morphometric investigations in ampelography. |
format | Online Article Text |
id | pubmed-9781146 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9781146 2022-12-24 Morphological Traits Evaluated with Random Forest Method Explains Natural Classification of Grapevine (Vitis vinifera L.) Cultivars Szűgyi-Reiczigel, Zsófia; Ladányi, Márta; Bisztray, György Dénes; Varga, Zsuzsanna; Bodor-Pesti, Péter Plants (Basel) Article There are hundreds of morphologic and morphometric traits available to classify and identify grapevine (Vitis vinifera L.) genotypes, while statistical evaluation of those has certain limitations, especially when we have no information about the traits that are discriminative to a certain sample set. High numbers of investigated characters could cause redundancy, while reducing those numbers may result in data loss. Grapevine is one of the most important horticultural crops, with many cultivars in production. The characterization of the genotypes is of undeniably high importance. In this study, we analyzed a dataset of scientific and historical importance with 125 morphological traits of 97 grapevine cultivars described by Németh in 1966. However, the traits are not independent in a set of a large number of categorical traits with too few cultivars. Therefore, the number of traits was first reduced using a simple and effective algorithm to eliminate traits with redundant information content using the asymmetric measure of association Goodman and Kruskal’s λ. We reduced the number of traits from 125 to 59 without any information loss. For the classification, we applied a random forest (RF) method. In this way, 93% of the cultivars were correctly classified using only four traits of the data set. To our knowledge, only a few studies applied a trait elimination algorithm similar to ours in ampelography that can be used for other biological data sets of similar structure. The classification results give a morphological explanation to several cultivars from the Carpathian Basin, a territory where all three Vitis vinifera L. geographical groups, occidentalis, orientalis and pontica, are represented. We found that the information-loss-avoiding data reduction method we applied in our study solved the redundancy-caused interdependencies and provided a suitable dataset for classifying grapevine genotypes. For example, this method may successfully be applied in digital image analysis-based traditional morphometric investigations in ampelography. MDPI 2022-12-08 /pmc/articles/PMC9781146/ /pubmed/36559539 http://dx.doi.org/10.3390/plants11243428 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Morphological Traits Evaluated with Random Forest Method Explains Natural Classification of Grapevine (Vitis vinifera L.) Cultivars |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781146/ https://www.ncbi.nlm.nih.gov/pubmed/36559539 http://dx.doi.org/10.3390/plants11243428 |
work_keys_str_mv | AT szugyireiczigelzsofia morphologicaltraitsevaluatedwithrandomforestmethodexplainsnaturalclassificationofgrapevinevitisviniferalcultivars AT ladanyimarta morphologicaltraitsevaluatedwithrandomforestmethodexplainsnaturalclassificationofgrapevinevitisviniferalcultivars AT bisztraygyorgydenes morphologicaltraitsevaluatedwithrandomforestmethodexplainsnaturalclassificationofgrapevinevitisviniferalcultivars AT vargazsuzsanna morphologicaltraitsevaluatedwithrandomforestmethodexplainsnaturalclassificationofgrapevinevitisviniferalcultivars AT bodorpestipeter morphologicaltraitsevaluatedwithrandomforestmethodexplainsnaturalclassificationofgrapevinevitisviniferalcultivars |
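
The abstract in this record mentions removing redundant categorical traits with Goodman and Kruskal's λ before classification. The following is a minimal sketch of that idea, not the authors' algorithm: it computes the asymmetric λ(y | x) and greedily drops any trait that is perfectly predictable (λ = 1) from a trait already kept. The function names, the greedy order-dependent strategy, the handling of constant traits, and the toy trait names and values are all assumptions made for illustration; the random forest step is not shown.

```python
from collections import Counter, defaultdict


def goodman_kruskal_lambda(x, y):
    """Asymmetric lambda(y | x): proportional reduction in prediction
    error for y when x is known, versus always guessing y's mode."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    modal_y = max(Counter(y).values())
    if modal_y == n:
        # Assumption of this sketch: a constant trait carries no
        # information, so it is treated as fully predictable.
        return 1.0
    # Count y categories separately within each category of x.
    by_x = defaultdict(Counter)
    for xi, yi in zip(x, y):
        by_x[xi][yi] += 1
    within = sum(max(counts.values()) for counts in by_x.values())
    return (within - modal_y) / (n - modal_y)


def drop_redundant_traits(table):
    """Greedy, order-dependent reduction (hypothetical strategy).

    table maps a trait name to a list of category labels, one per
    cultivar. A trait is dropped when some already-kept trait
    predicts it perfectly, i.e. lambda(trait | kept) == 1.
    """
    kept = []
    for name, values in table.items():
        if any(goodman_kruskal_lambda(table[k], values) == 1.0 for k in kept):
            continue
        kept.append(name)
    return kept


# Toy data with hypothetical trait names, six cultivars.
toy_traits = {
    "leaf_shape": ["a", "a", "b", "b", "c", "c"],
    "leaf_shape_coarse": ["x", "x", "y", "y", "y", "y"],
    "berry_color": ["r", "w", "r", "w", "r", "w"],
}

kept_traits = drop_redundant_traits(toy_traits)
print(kept_traits)
```

In the toy data, `leaf_shape_coarse` is fully determined by `leaf_shape`, so dropping it loses no information, while the reverse direction gives λ below 1, which is why an asymmetric measure matters here.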