Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants
In-silico classification of the pathogenic status of somatic variants has been shown to be promising in promoting the clinical utilization of genetic tests. The majority of the available classification tools are designed based on the characteristics of germline variants or a combination of germline and soma...
Main authors: | Feizi, Nikta; Liu, Qian; Murphy, Leigh; Hu, Pingzhao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8804317/ https://www.ncbi.nlm.nih.gov/pubmed/35116056 http://dx.doi.org/10.3389/fgene.2021.805656 |
_version_ | 1784643050590437376 |
---|---|
author | Feizi, Nikta Liu, Qian Murphy, Leigh Hu, Pingzhao |
author_facet | Feizi, Nikta Liu, Qian Murphy, Leigh Hu, Pingzhao |
author_sort | Feizi, Nikta |
collection | PubMed |
description | In-silico classification of the pathogenic status of somatic variants has been shown to be promising in promoting the clinical utilization of genetic tests. The majority of the available classification tools are designed based on the characteristics of germline variants or a combination of germline and somatic variants. The significance of somatic variants in cancer initiation and progression calls for the development of classifiers specialized in classifying the pathogenic status of cancer somatic variants, based on models trained on cancer somatic variants. We established a gold standard exclusively for cancer somatic single nucleotide variants (SNVs) collected from the Catalogue of Somatic Mutations in Cancer (COSMIC). We developed two support vector machine (SVM) classifiers based on genomic features of cancer somatic SNVs located in coding and non-coding regions of the genome, respectively. The SVM classifiers achieved an area under the ROC curve of 0.94 and 0.89 for classifying the pathogenic status of coding and non-coding cancer somatic SNVs, respectively. Our models outperform two well-known classification tools, FATHMM-FX and CScape, in classifying both coding and non-coding cancer somatic variants. Furthermore, we applied our models to predict the pathogenic status of somatic variants identified in young breast cancer patients from the METABRIC and TCGA-BRCA studies. The results indicated that, using a classification threshold of 0.8, our “coding” model predicted 1,853 positive SNVs (out of 6,910) from the TCGA-BRCA dataset and 500 positive SNVs (out of 1,882) from the METABRIC dataset. Interestingly, through comparative survival analysis of the positive predictions from our models, we identified a young-specific pathogenic somatic variant with potential for the prognosis of early-onset breast cancer in young women. |
format | Online Article Text |
id | pubmed-8804317 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8804317 2022-02-02 Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants Feizi, Nikta Liu, Qian Murphy, Leigh Hu, Pingzhao Front Genet Genetics In-silico classification of the pathogenic status of somatic variants has been shown to be promising in promoting the clinical utilization of genetic tests. The majority of the available classification tools are designed based on the characteristics of germline variants or a combination of germline and somatic variants. The significance of somatic variants in cancer initiation and progression calls for the development of classifiers specialized in classifying the pathogenic status of cancer somatic variants, based on models trained on cancer somatic variants. We established a gold standard exclusively for cancer somatic single nucleotide variants (SNVs) collected from the Catalogue of Somatic Mutations in Cancer (COSMIC). We developed two support vector machine (SVM) classifiers based on genomic features of cancer somatic SNVs located in coding and non-coding regions of the genome, respectively. The SVM classifiers achieved an area under the ROC curve of 0.94 and 0.89 for classifying the pathogenic status of coding and non-coding cancer somatic SNVs, respectively. Our models outperform two well-known classification tools, FATHMM-FX and CScape, in classifying both coding and non-coding cancer somatic variants. Furthermore, we applied our models to predict the pathogenic status of somatic variants identified in young breast cancer patients from the METABRIC and TCGA-BRCA studies. The results indicated that, using a classification threshold of 0.8, our “coding” model predicted 1,853 positive SNVs (out of 6,910) from the TCGA-BRCA dataset and 500 positive SNVs (out of 1,882) from the METABRIC dataset. Interestingly, through comparative survival analysis of the positive predictions from our models, we identified a young-specific pathogenic somatic variant with potential for the prognosis of early-onset breast cancer in young women. Frontiers Media S.A. 2022-01-18 /pmc/articles/PMC8804317/ /pubmed/35116056 http://dx.doi.org/10.3389/fgene.2021.805656 Text en Copyright © 2022 Feizi, Liu, Murphy and Hu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Feizi, Nikta Liu, Qian Murphy, Leigh Hu, Pingzhao Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title | Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title_full | Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title_fullStr | Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title_full_unstemmed | Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title_short | Computational Prediction of the Pathogenic Status of Cancer-Specific Somatic Variants |
title_sort | computational prediction of the pathogenic status of cancer-specific somatic variants |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8804317/ https://www.ncbi.nlm.nih.gov/pubmed/35116056 http://dx.doi.org/10.3389/fgene.2021.805656 |
work_keys_str_mv | AT feizinikta computationalpredictionofthepathogenicstatusofcancerspecificsomaticvariants AT liuqian computationalpredictionofthepathogenicstatusofcancerspecificsomaticvariants AT murphyleigh computationalpredictionofthepathogenicstatusofcancerspecificsomaticvariants AT hupingzhao computationalpredictionofthepathogenicstatusofcancerspecificsomaticvariants |
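
The thresholding step mentioned in the description can be illustrated with a minimal sketch. It is not the authors' code: the function names `call_positive` and `positive_fraction` are hypothetical, the example scores are invented for illustration, and treating a score exactly equal to 0.8 as positive is an assumption the record does not confirm. Only the counts (1853 of 6,910 and 500 of 1882) and the 0.8 threshold come from the description.

```python
def call_positive(scores, threshold=0.8):
    """Label each classifier score as positive when it reaches the threshold.

    Assumption: the comparison is inclusive (>=); the record does not say.
    """
    return [score >= threshold for score in scores]


def positive_fraction(n_positive, n_total):
    """Share of variants that were predicted positive."""
    return n_positive / n_total


# Counts reported in the description for the "coding" model at threshold 0.8.
reported = {"TCGA-BRCA": (1853, 6910), "METABRIC": (500, 1882)}
fractions = {
    name: positive_fraction(n_pos, n_all)
    for name, (n_pos, n_all) in reported.items()
}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of SNVs predicted positive")

# Hypothetical scores, only to show how the cutoff behaves.
example_scores = [0.95, 0.80, 0.79, 0.10]
example_calls = call_positive(example_scores)
```

Under these assumptions, roughly a quarter of the coding SNVs in each dataset fall above the cutoff, which is a descriptive ratio derived from the reported counts rather than a result stated in the record.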