
Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants

Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on the prediction of the functional impact of a subtype of variants alone. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not acc...


Bibliographic Details
Main Authors: Khandakji, Mohannad N., Mifsud, Borbala
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561395/
https://www.ncbi.nlm.nih.gov/pubmed/36246618
http://dx.doi.org/10.3389/fgene.2022.982930
author Khandakji, Mohannad N.
Mifsud, Borbala
collection PubMed
description Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on predicting the functional impact of a single subtype of variants. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not account for gene-specific information, such as hotspots of pathogenic variants. Local, gene-specific information has been shown to aid variant pathogenicity prediction; therefore, our aim was to develop a BRCA2-specific machine learning model to predict the pathogenicity of all types of BRCA2 variants. Methods: We developed an XGBoost-based machine learning model to predict the pathogenicity of BRCA2 variants. The model utilizes general variant information such as position, frequency, and consequence for the canonical BRCA2 transcript, as well as deleteriousness prediction scores from several tools. We trained the model on 80% of the variants expert-reviewed by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and tested its performance on the remaining 20%, as well as on an independent set of variants of uncertain significance with experimentally determined functional scores. Results: The novel gene-specific model predicted the pathogenicity of ENIGMA BRCA2 variants with an accuracy of 99.9%. The model also performed well in predicting the functional consequence of the independent set of variants (accuracy of up to 91.3%). Conclusion: This new, gene-specific model is an accurate method for interpreting the pathogenicity of variants in the BRCA2 gene. It is a valuable addition for variant classification and can prioritize unreviewed variants for functional analysis or expert review.
format Online
Article
Text
id pubmed-9561395
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9561395 2022-10-15 Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants Khandakji, Mohannad N. Mifsud, Borbala Front Genet Genetics Frontiers Media S.A. 2022-09-30 /pmc/articles/PMC9561395/ /pubmed/36246618 http://dx.doi.org/10.3389/fgene.2022.982930 Text en Copyright © 2022 Khandakji and Mifsud. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561395/
https://www.ncbi.nlm.nih.gov/pubmed/36246618
http://dx.doi.org/10.3389/fgene.2022.982930
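The evaluation protocol summarized in the description (train on 80% of the labeled variants, test on the remaining 20%, report accuracy) can be illustrated with a minimal sketch. This is not the authors' code and does not involve XGBoost: it only shows a class-stratified 80/20 split and the accuracy metric, using the Python standard library. The function names, the random seed, the use of stratification, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test sets, keeping class proportions similar.

    Stratification is an assumption here; the record does not say how the
    80/20 split was drawn.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: 1 = pathogenic, 0 = benign (not real variant data).
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(labels)
```

In a real pipeline, a classifier would be fitted on the rows selected by `train_idx`, and `accuracy` would be computed from its predictions on the rows selected by `test_idx`.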