MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction

BACKGROUND: With the significant reduction in the cost of high-throughput sequencing technology, genomic selection technology has been rapidly developed in the field of plant breeding. Although numerous genomic selection methods have been proposed by researchers, the existing genomic selection metho...

Bibliographic Details
Main authors: Zhou, Ganghui, Gao, Jing, Zuo, Dongshi, Li, Jin, Li, Rui
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10566073/
https://www.ncbi.nlm.nih.gov/pubmed/37817077
http://dx.doi.org/10.1186/s12859-023-05514-7
author Zhou, Ganghui
Gao, Jing
Zuo, Dongshi
Li, Jin
Li, Rui
collection PubMed
description BACKGROUND: With the significant reduction in the cost of high-throughput sequencing, genomic selection has developed rapidly in the field of plant breeding. Although researchers have proposed numerous genomic selection methods, existing methods still suffer from poor prediction accuracy in practical applications. RESULTS: This paper proposes MSXFGP, a genomic prediction method that uses a multi-strategy improved sparrow search algorithm (SSA) to optimize XGBoost parameters and feature selection. First, logistic chaos mapping, elite learning, adaptive parameter adjustment, Lévy flight, and an early stop strategy are incorporated into the SSA. This integration enhances the global and local search capabilities of the algorithm, thereby improving its convergence accuracy and stability. The improved SSA is then used to optimize XGBoost parameters and feature selection concurrently, yielding a new genomic selection method, MSXFGP. Using both the coefficient of determination R² and the Pearson correlation coefficient as evaluation metrics, MSXFGP was evaluated against six existing genomic selection models across six datasets. The findings show that the prediction accuracy of MSXFGP is comparable to or better than that of existing widely used genomic selection methods, and that it is more accurate when R² is used as the assessment metric. Additionally, this research provides a user-friendly Python utility designed to help breeders apply the method. MSXFGP is accessible at https://github.com/DIBreeding/MSXFGP. CONCLUSIONS: The experimental results show that the prediction accuracy of MSXFGP is comparable to or better than that of existing genomic selection methods, providing a new approach for plant genomic selection.
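The abstract names logistic chaos mapping as one of the SSA improvements and R² and the Pearson correlation coefficient as the evaluation metrics. The following is a minimal sketch of those three pieces only, not the authors' implementation: the function names, the seed value x0 = 0.7, the control parameter mu = 4.0, and the use of the chaotic sequence for population initialisation are all assumptions made for illustration and are not taken from the MSXFGP repository.

```python
import math


def logistic_map_population(n_agents, n_dims, lower, upper, x0=0.7, mu=4.0):
    """Build a population from a logistic chaotic sequence x <- mu * x * (1 - x).

    x0 and mu are illustrative assumptions; with mu = 4.0 and x0 in (0, 1)
    every iterate stays inside [0, 1], so each coordinate lands inside its
    [lower[d], upper[d]] search bounds.
    """
    population = []
    x = x0
    for _ in range(n_agents):
        agent = []
        for d in range(n_dims):
            x = mu * x * (1.0 - x)
            agent.append(lower[d] + x * (upper[d] - lower[d]))
        population.append(agent)
    return population


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


if __name__ == "__main__":
    # Hypothetical two-dimensional search space, e.g. two tuning parameters.
    pop = logistic_map_population(5, 2, lower=[0.01, 1.0], upper=[0.3, 10.0])
    print(len(pop), len(pop[0]))
    print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The two metrics can disagree: predictions that are perfectly correlated with the observations but wrongly scaled give a Pearson coefficient of 1 while R² drops, which is one reason a study might report both.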
format Online
Article
Text
id pubmed-10566073
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10566073 2023-10-12 MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction Zhou, Ganghui Gao, Jing Zuo, Dongshi Li, Jin Li, Rui BMC Bioinformatics Research BioMed Central 2023-10-11 /pmc/articles/PMC10566073/ /pubmed/37817077 http://dx.doi.org/10.1186/s12859-023-05514-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10566073/
https://www.ncbi.nlm.nih.gov/pubmed/37817077
http://dx.doi.org/10.1186/s12859-023-05514-7