Morphological Traits Evaluated with Random Forest Method Explains Natural Classification of Grapevine (Vitis vinifera L.) Cultivars

There are hundreds of morphologic and morphometric traits available to classify and identify grapevine (Vitis vinifera L.) genotypes, while statistical evaluation of those has certain limitations, especially when we have no information about the traits that are discriminative to a certain sample set...

Bibliographic Details
Main Authors: Szűgyi-Reiczigel, Zsófia, Ladányi, Márta, Bisztray, György Dénes, Varga, Zsuzsanna, Bodor-Pesti, Péter
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9781146/
https://www.ncbi.nlm.nih.gov/pubmed/36559539
http://dx.doi.org/10.3390/plants11243428
_version_ 1784857002390847488
author Szűgyi-Reiczigel, Zsófia
Ladányi, Márta
Bisztray, György Dénes
Varga, Zsuzsanna
Bodor-Pesti, Péter
collection PubMed
description There are hundreds of morphologic and morphometric traits available to classify and identify grapevine (Vitis vinifera L.) genotypes, while statistical evaluation of those has certain limitations, especially when we have no information about the traits that are discriminative to a certain sample set. High numbers of investigated characters could cause redundancy, while reducing those numbers may result in data loss. Grapevine is one of the most important horticultural crops, with many cultivars in production. The characterization of the genotypes is of undeniably high importance. In this study, we analyzed a dataset of scientific and historical importance with 125 morphological traits of 97 grapevine cultivars described by Németh in 1966. However, the traits are not independent in a set of a large number of categorical traits with too few cultivars. Therefore, the number of traits was first reduced using a simple and effective algorithm to eliminate traits with redundant information content using the asymmetric measure of association Goodman and Kruskal’s λ. We reduced the number of traits from 125 to 59 without any information loss. For the classification, we applied a random forest (RF) method. In this way, 93% of the cultivars were correctly classified using only four traits of the data set. To our knowledge, only a few studies applied a trait elimination algorithm similar to ours in ampelography that can be used for other biological data sets of similar structure. The classification results give a morphological explanation to several cultivars from the Carpathian Basin, a territory where all three Vitis vinifera L. geographical groups, occidentalis, orientalis and pontica, are represented. We found that the information-loss-avoiding data reduction method we applied in our study solved the redundancy-caused interdependencies and provided a suitable dataset for classifying grapevine genotypes. For example, this method may successfully be applied in digital image analysis-based traditional morphometric investigations in ampelography.
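The trait-reduction step described in the abstract can be illustrated with a minimal sketch. It assumes that a trait counts as redundant when some other retained trait predicts it perfectly, that is, when Goodman and Kruskal's λ equals 1; the record does not state the authors' exact elimination rule, so this criterion, the function names, the trait names, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def goodman_kruskal_lambda(x, y):
    """Asymmetric lambda(y | x): proportional reduction in the error of
    predicting categorical y once categorical x is known."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(y)
    modal_y = max(Counter(y).values())  # best guess without knowing x
    if modal_y == n:
        # Lambda is undefined for a constant y; this sketch adopts the
        # convention that a constant trait is trivially predictable.
        return 1.0
    by_x = defaultdict(Counter)
    for xi, yi in zip(x, y):
        by_x[xi][yi] += 1
    # Correct predictions when guessing the modal y within each x category.
    hits = sum(max(counts.values()) for counts in by_x.values())
    return (hits - modal_y) / (n - modal_y)


def drop_redundant_traits(table):
    """Assumed rule: drop a trait if a still-retained trait determines it
    completely (lambda == 1), so no information is lost."""
    kept = list(table)
    for name in list(table):
        others = [k for k in kept if k != name]
        if any(goodman_kruskal_lambda(table[o], table[name]) == 1.0
               for o in others):
            kept.remove(name)
    return kept


# Hypothetical toy data: six cultivars, four categorical traits.
table = {
    "leaf_shape": ["a", "a", "b", "b", "c", "c"],
    "leaf_shape_code": ["1", "1", "2", "2", "3", "3"],  # relabelled leaf_shape
    "berry_color": ["w", "r", "w", "r", "w", "r"],
    "lobed": ["y", "y", "n", "n", "n", "n"],  # determined by leaf_shape
}
print(drop_redundant_traits(table))  # ['leaf_shape_code', 'berry_color']
```

Because λ is asymmetric, `leaf_shape` predicts `lobed` perfectly (λ = 1) while `lobed` predicts `leaf_shape` only partially (λ = 0.5), which is why the direction of the comparison matters when deciding which trait to discard.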
format Online
Article
Text
id pubmed-9781146
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9781146 2022-12-24 Plants (Basel) Article MDPI 2022-12-08 /pmc/articles/PMC9781146/ /pubmed/36559539 http://dx.doi.org/10.3390/plants11243428 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Morphological Traits Evaluated with Random Forest Method Explains Natural Classification of Grapevine (Vitis vinifera L.) Cultivars
topic Article