
Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants

In-silico classification of the pathogenic status of somatic variants has shown promise for promoting the clinical use of genetic tests. The majority of available classification tools are designed around the characteristics of germline variants or the combination of germline and soma...


Bibliographic Details
Main authors: Feizi, Nikta, Liu, Qian, Murphy, Leigh, Hu, Pingzhao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8804317/
https://www.ncbi.nlm.nih.gov/pubmed/35116056
http://dx.doi.org/10.3389/fgene.2021.805656
collection PubMed
description In-silico classification of the pathogenic status of somatic variants has shown promise for promoting the clinical use of genetic tests. The majority of available classification tools are designed around the characteristics of germline variants or a combination of germline and somatic variants. The significance of somatic variants in cancer initiation and progression calls for classifiers that are specialized for the pathogenic status of cancer somatic variants and trained on cancer somatic variants. We established a gold standard exclusively for cancer somatic single nucleotide variants (SNVs) collected from the Catalogue of Somatic Mutations in Cancer. We developed two support vector machine (SVM) classifiers based on genomic features of cancer somatic SNVs located in coding and non-coding regions of the genome, respectively. The SVM classifiers achieved areas under the ROC curve of 0.94 and 0.89 for classifying the pathogenic status of coding and non-coding cancer somatic SNVs, respectively. Our models outperform two well-known classification tools, FATHMM-FX and CScape, in classifying both coding and non-coding cancer somatic variants. Furthermore, we applied our models to predict the pathogenic status of somatic variants identified in young breast cancer patients from the METABRIC and TCGA-BRCA studies. Using a classification threshold of 0.8, our “coding” model predicted 1,853 positive SNVs (out of 6,910) in the TCGA-BRCA dataset and 500 positive SNVs (out of 1,882) in the METABRIC dataset. Through comparative survival analysis of the positive predictions from our models, we identified a young-specific pathogenic somatic variant with potential for the prognosis of early-onset breast cancer in young women.
id pubmed-8804317
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8804317 2022-02-02 Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants Feizi, Nikta Liu, Qian Murphy, Leigh Hu, Pingzhao Front Genet Genetics Frontiers Media S.A. 2022-01-18 /pmc/articles/PMC8804317/ /pubmed/35116056 http://dx.doi.org/10.3389/fgene.2021.805656 Text en
Copyright © 2022 Feizi, Liu, Murphy and Hu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Genetics
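The abstract in this record reports two kinds of numbers: areas under the ROC curve (0.94 and 0.89) and counts of SNVs called positive at a classification threshold of 0.8. The following is a minimal Python sketch of how such quantities are commonly computed from classifier scores. It is not the authors' code: the function names, the toy labels and scores, and the choice of ">=" at the threshold are all assumptions made for illustration. Only the counts 1,853 of 6,910 and 500 of 1,882 come from the abstract.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def call_positive(scores, threshold=0.8):
    """Flag a variant as predicted pathogenic when its score reaches the threshold.
    Using >= rather than > is an assumption; the abstract does not say which."""
    return [s >= threshold for s in scores]


def positive_fraction(n_positive, n_total):
    """Share of variants called positive."""
    return n_positive / n_total


# Hypothetical toy data, not taken from the paper.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.95, 0.85, 0.40, 0.60, 0.30, 0.10]

if __name__ == "__main__":
    print("toy AUC:", roc_auc(toy_labels, toy_scores))
    print("toy calls at 0.8:", call_positive(toy_scores))
    # Counts reported in the abstract for the "coding" model.
    print("TCGA-BRCA positive fraction:", round(positive_fraction(1853, 6910), 3))
    print("METABRIC positive fraction:", round(positive_fraction(500, 1882), 3))
```

Under these reported counts, roughly 27% of the SNVs in each dataset are called positive at the 0.8 threshold, which can be checked with positive_fraction above.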