Extreme Gradient Boosting Tuned with Metaheuristic Algorithms for Predicting Myeloid NGS Onco-Somatic Variant Pathogenicity
The advent of next-generation sequencing (NGS) technologies has revolutionized the field of bioinformatics and genomics, particularly in the area of onco-somatic genetics. NGS has provided a wealth of information about the genetic changes that underlie cancer and has considerably improved our abilit...
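Main Authors: | Pellegrino, Eric; Camilla, Clara; Abbou, Norman; Beaufils, Nathalie; Pissier, Christel; Gabert, Jean; Nanni-Metellus, Isabelle; Ouafik, L’Houcine |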
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10376905/ https://www.ncbi.nlm.nih.gov/pubmed/37508780 http://dx.doi.org/10.3390/bioengineering10070753 |
_version_ | 1785079388757295104 |
---|---|
author | Pellegrino, Eric Camilla, Clara Abbou, Norman Beaufils, Nathalie Pissier, Christel Gabert, Jean Nanni-Metellus, Isabelle Ouafik, L’Houcine |
author_facet | Pellegrino, Eric Camilla, Clara Abbou, Norman Beaufils, Nathalie Pissier, Christel Gabert, Jean Nanni-Metellus, Isabelle Ouafik, L’Houcine |
author_sort | Pellegrino, Eric |
collection | PubMed |
description | The advent of next-generation sequencing (NGS) technologies has revolutionized the field of bioinformatics and genomics, particularly in the area of onco-somatic genetics. NGS has provided a wealth of information about the genetic changes that underlie cancer and has considerably improved our ability to diagnose and treat cancer. However, the large amount of data generated by NGS makes it difficult to interpret the variants. To address this, machine learning algorithms such as Extreme Gradient Boosting (XGBoost) have become increasingly important tools in the analysis of NGS data. In this paper, we present a machine learning tool that uses XGBoost to predict the pathogenicity of a mutation in the myeloid panel. We optimized the performance of XGBoost using metaheuristic algorithms and compared our predictions with the decisions of biologists and other prediction tools. The myeloid panel is a critical component in the diagnosis and treatment of myeloid neoplasms, and sequencing this panel identifies specific genetic mutations, enabling more accurate diagnoses and tailored treatment plans. We trained the XGBoost algorithm on datasets collected from our myeloid panel NGS analysis. The dataset comprises 15,977 mutation variants: 13,221 single-nucleotide variants (SNVs), 73 multi-nucleotide variants (MNVs), and 2,683 insertions/deletions (INDELs). The optimal XGBoost hyperparameters were found with Differential Evolution (DE), yielding an accuracy of 99.35%, precision of 98.70%, specificity of 98.71%, and sensitivity of 100%. |
format | Online Article Text |
id | pubmed-10376905 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10376905 2023-07-29 Extreme Gradient Boosting Tuned with Metaheuristic Algorithms for Predicting Myeloid NGS Onco-Somatic Variant Pathogenicity Pellegrino, Eric Camilla, Clara Abbou, Norman Beaufils, Nathalie Pissier, Christel Gabert, Jean Nanni-Metellus, Isabelle Ouafik, L’Houcine Bioengineering (Basel) Article MDPI 2023-06-23 /pmc/articles/PMC10376905/ /pubmed/37508780 http://dx.doi.org/10.3390/bioengineering10070753 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Extreme Gradient Boosting Tuned with Metaheuristic Algorithms for Predicting Myeloid NGS Onco-Somatic Variant Pathogenicity |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10376905/ https://www.ncbi.nlm.nih.gov/pubmed/37508780 http://dx.doi.org/10.3390/bioengineering10070753 |
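
The description reports accuracy, precision, specificity, and sensitivity without defining them. As a minimal sketch, the standard confusion-matrix definitions of these four metrics can be written as below. The class and function names are illustrative, and the counts are hypothetical placeholders chosen for easy arithmetic; they are not the paper's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (pathogenic = positive class)."""
    tp: int  # pathogenic variants predicted pathogenic
    fp: int  # non-pathogenic variants predicted pathogenic
    tn: int  # non-pathogenic variants predicted non-pathogenic
    fn: int  # pathogenic variants predicted non-pathogenic


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return the four metrics named in the description as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "specificity": c.tn / (c.tn + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),
    }


# Hypothetical counts for illustration only (not taken from the paper).
counts = ConfusionCounts(tp=90, fp=10, tn=95, fn=5)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, a sensitivity of 1 (100%) means the classifier produced no false negatives, that is, no pathogenic variant in the evaluation set was predicted as non-pathogenic.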