
An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries

Selecting the best planting area for blueberries is an essential issue in agriculture. To better improve the effectiveness of blueberry cultivation, a machine learning-based classification model for blueberry ecological suitability was proposed for the first time and its validation was conducted by...


Bibliographic Details
Main authors: Chang, Wenfeng, Wang, Xiao, Yang, Jing, Qin, Tao
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9961688/
https://www.ncbi.nlm.nih.gov/pubmed/36850409
http://dx.doi.org/10.3390/s23041811
_version_ 1784895816331165696
author Chang, Wenfeng
Wang, Xiao
Yang, Jing
Qin, Tao
author_facet Chang, Wenfeng
Wang, Xiao
Yang, Jing
Qin, Tao
author_sort Chang, Wenfeng
collection PubMed
description Selecting the best planting area for blueberries is an essential issue in agriculture. To improve the effectiveness of blueberry cultivation, this paper proposes, for the first time, a machine learning-based classification model for blueberry ecological suitability and validates it using multi-source environmental feature data. The sparrow search algorithm (SSA) was adopted to optimize the CatBoost model, which classifies the ecological suitability of blueberries based on selected data features. First, the Borderline-SMOTE algorithm was used to balance the number of positive and negative samples. The Variance Inflation Factor and information gain methods were then applied to screen the factors affecting blueberry growth. Subsequently, the processed data were fed into CatBoost for training, and the CatBoost parameters were optimized with SSA to obtain the optimal model. Finally, the SSA-CatBoost model was used to classify the ecological suitability of blueberries and output the suitability types. In a case study of a blueberry plantation in Majiang County, Guizhou Province, China, the SSA-CatBoost-based blueberry ecological suitability model achieved an AUC of 0.921, which is 2.68% higher than that of CatBoost (AUC = 0.897) and significantly higher than those of Logistic Regression (AUC = 0.855), Support Vector Machine (AUC = 0.864), and Random Forest (AUC = 0.875). Furthermore, the ecological suitability of blueberries in Majiang County was mapped according to the classification results of the different models. Compared with the actual blueberry cultivation situation in Majiang County, the classification results of the proposed SSA-CatBoost model match it best, which provides a high reference value for the selection of blueberry cultivation sites.
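The "2.68% higher" figure in the description is consistent with a relative gain, (0.921 - 0.897) / 0.897, rather than a difference in percentage points. The short sketch below reproduces that arithmetic from the AUC values quoted in the description; the dictionary and function names are illustrative assumptions and are not taken from the paper.

```python
# AUC values as quoted in the record's description field.
REPORTED_AUC = {
    "SSA-CatBoost": 0.921,
    "CatBoost": 0.897,
    "Logistic Regression": 0.855,
    "Support Vector Machine": 0.864,
    "Random Forest": 0.875,
}


def relative_gain_percent(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Compare the proposed model against every baseline named in the description.
proposed = REPORTED_AUC["SSA-CatBoost"]
for name, auc in REPORTED_AUC.items():
    if name == "SSA-CatBoost":
        continue
    gain = relative_gain_percent(proposed, auc)
    print(f"SSA-CatBoost vs {name}: {gain:.2f}% relative AUC gain")
```

For the CatBoost baseline this gives 2.68%, matching the reported value; the other gains are derived here for comparison only and are not stated in the record.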
format Online
Article
Text
id pubmed-9961688
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9961688 2023-02-26 An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries Chang, Wenfeng Wang, Xiao Yang, Jing Qin, Tao Sensors (Basel) Article Selecting the best planting area for blueberries is an essential issue in agriculture. To improve the effectiveness of blueberry cultivation, this paper proposes, for the first time, a machine learning-based classification model for blueberry ecological suitability and validates it using multi-source environmental feature data. The sparrow search algorithm (SSA) was adopted to optimize the CatBoost model, which classifies the ecological suitability of blueberries based on selected data features. First, the Borderline-SMOTE algorithm was used to balance the number of positive and negative samples. The Variance Inflation Factor and information gain methods were then applied to screen the factors affecting blueberry growth. Subsequently, the processed data were fed into CatBoost for training, and the CatBoost parameters were optimized with SSA to obtain the optimal model. Finally, the SSA-CatBoost model was used to classify the ecological suitability of blueberries and output the suitability types. In a case study of a blueberry plantation in Majiang County, Guizhou Province, China, the SSA-CatBoost-based blueberry ecological suitability model achieved an AUC of 0.921, which is 2.68% higher than that of CatBoost (AUC = 0.897) and significantly higher than those of Logistic Regression (AUC = 0.855), Support Vector Machine (AUC = 0.864), and Random Forest (AUC = 0.875). Furthermore, the ecological suitability of blueberries in Majiang County was mapped according to the classification results of the different models. Compared with the actual blueberry cultivation situation in Majiang County, the classification results of the proposed SSA-CatBoost model match it best, which provides a high reference value for the selection of blueberry cultivation sites. MDPI 2023-02-06 /pmc/articles/PMC9961688/ /pubmed/36850409 http://dx.doi.org/10.3390/s23041811 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Chang, Wenfeng
Wang, Xiao
Yang, Jing
Qin, Tao
An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title_full An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title_fullStr An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title_full_unstemmed An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title_short An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries
title_sort improved catboost-based classification model for ecological suitability of blueberries
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9961688/
https://www.ncbi.nlm.nih.gov/pubmed/36850409
http://dx.doi.org/10.3390/s23041811
work_keys_str_mv AT changwenfeng animprovedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT wangxiao animprovedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT yangjing animprovedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT qintao animprovedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT changwenfeng improvedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT wangxiao improvedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT yangjing improvedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries
AT qintao improvedcatboostbasedclassificationmodelforecologicalsuitabilityofblueberries