Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants
Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on the prediction of the functional impact of a subtype of variants alone. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not acc...
Main authors: | Khandakji, Mohannad N., Mifsud, Borbala |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561395/ https://www.ncbi.nlm.nih.gov/pubmed/36246618 http://dx.doi.org/10.3389/fgene.2022.982930 |
_version_ | 1784807944129347584 |
---|---|
author | Khandakji, Mohannad N. Mifsud, Borbala |
author_facet | Khandakji, Mohannad N. Mifsud, Borbala |
author_sort | Khandakji, Mohannad N. |
collection | PubMed |
description | Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on the prediction of the functional impact of a subtype of variants alone. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not account for gene-specific information, such as hotspots of pathogenic variants. Local, gene-specific information has been shown to aid variant pathogenicity prediction; therefore, our aim was to develop a BRCA2-specific machine learning model to predict pathogenicity of all types of BRCA2 variants. Methods: We developed an XGBoost-based machine learning model to predict pathogenicity of BRCA2 variants. The model utilizes general variant information such as position, frequency, and consequence for the canonical BRCA2 transcript, as well as deleteriousness prediction scores from several tools. We trained the model on 80% of the variants expert-reviewed by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and tested its performance on the remaining 20%, as well as on an independent set of variants of uncertain significance with experimentally determined functional scores. Results: The novel gene-specific model predicted the pathogenicity of ENIGMA BRCA2 variants with an accuracy of 99.9%. The model also performed excellently on predicting the functional consequence of the independent set of variants (accuracy was up to 91.3%). Conclusion: This new, gene-specific model is an accurate method for interpreting the pathogenicity of variants in the BRCA2 gene. It is a valuable addition for variant classification and can prioritize unreviewed variants for functional analysis or expert review. |
format | Online Article Text |
id | pubmed-9561395 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9561395 2022-10-15 Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants Khandakji, Mohannad N. Mifsud, Borbala Front Genet Genetics Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on the prediction of the functional impact of a subtype of variants alone. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not account for gene-specific information, such as hotspots of pathogenic variants. Local, gene-specific information has been shown to aid variant pathogenicity prediction; therefore, our aim was to develop a BRCA2-specific machine learning model to predict pathogenicity of all types of BRCA2 variants. Methods: We developed an XGBoost-based machine learning model to predict pathogenicity of BRCA2 variants. The model utilizes general variant information such as position, frequency, and consequence for the canonical BRCA2 transcript, as well as deleteriousness prediction scores from several tools. We trained the model on 80% of the variants expert-reviewed by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and tested its performance on the remaining 20%, as well as on an independent set of variants of uncertain significance with experimentally determined functional scores. Results: The novel gene-specific model predicted the pathogenicity of ENIGMA BRCA2 variants with an accuracy of 99.9%. The model also performed excellently on predicting the functional consequence of the independent set of variants (accuracy was up to 91.3%). Conclusion: This new, gene-specific model is an accurate method for interpreting the pathogenicity of variants in the BRCA2 gene. It is a valuable addition for variant classification and can prioritize unreviewed variants for functional analysis or expert review. Frontiers Media S.A.
2022-09-30 /pmc/articles/PMC9561395/ /pubmed/36246618 http://dx.doi.org/10.3389/fgene.2022.982930 Text en Copyright © 2022 Khandakji and Mifsud. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Khandakji, Mohannad N. Mifsud, Borbala Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title | Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title_full | Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title_fullStr | Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title_full_unstemmed | Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title_short | Gene-specific machine learning model to predict the pathogenicity of BRCA2 variants |
title_sort | gene-specific machine learning model to predict the pathogenicity of brca2 variants |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561395/ https://www.ncbi.nlm.nih.gov/pubmed/36246618 http://dx.doi.org/10.3389/fgene.2022.982930 |
work_keys_str_mv | AT khandakjimohannadn genespecificmachinelearningmodeltopredictthepathogenicityofbrca2variants AT mifsudborbala genespecificmachinelearningmodeltopredictthepathogenicityofbrca2variants |
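
The Methods part of the description mentions training on 80% of the reviewed variants and measuring accuracy on the remaining 20%. This record does not include the authors' code, so the following is only a minimal sketch of such a hold-out evaluation using the Python standard library. The function names, the toy labels, and the choice to stratify the split by class are all assumptions made for illustration; the XGBoost model itself is not shown.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class separately.

    Stratifying by class is an assumption of this sketch; the record
    does not say how the 80/20 split was drawn.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (not real data): 1 = pathogenic, 0 = benign.
labels = [1] * 40 + [0] * 60
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

# A classifier would be fitted on the training indices and scored on the
# held-out indices, e.g. accuracy(true_test_labels, predicted_test_labels).
```

In this sketch the split is done per class so that both subsets keep roughly the same benign-to-pathogenic ratio, which matters when one class is rare.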