Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach
Many studies have shown that the distributions of the genomic, nucleotide, and epigenetic contexts of somatic variants in tumors are informative of cancer etiology. Recently, a new direction of research has focused on extracting signals from the contexts of germline variants and evidence has emerged...
Main Authors: | Guan, Zoe; Begg, Colin B.; Shen, Ronglai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Association for Cancer Research, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10032232/ https://www.ncbi.nlm.nih.gov/pubmed/36969913 http://dx.doi.org/10.1158/2767-9764.CRC-22-0355 |
_version_ | 1784910752865320960 |
---|---|
author | Guan, Zoe Begg, Colin B. Shen, Ronglai |
author_facet | Guan, Zoe Begg, Colin B. Shen, Ronglai |
author_sort | Guan, Zoe |
collection | PubMed |
description | Many studies have shown that the distributions of the genomic, nucleotide, and epigenetic contexts of somatic variants in tumors are informative of cancer etiology. Recently, a new direction of research has focused on extracting signals from the contexts of germline variants and evidence has emerged that patterns defined by these factors are associated with oncogenic pathways, histologic subtypes, and prognosis. It remains an open question whether aggregating germline variants using meta-features capturing their genomic, nucleotide, and epigenetic contexts can improve cancer risk prediction. This aggregation approach can potentially increase statistical power for detecting signals from rare variants, which have been hypothesized to be a major source of the missing heritability of cancer. Using germline whole-exome sequencing data from the UK Biobank, we developed risk models for 10 cancer types using known risk variants (cancer-associated SNPs and pathogenic variants in known cancer predisposition genes) as well as models that additionally include the meta-features. The meta-features did not improve the prediction accuracy of models based on known risk variants. It is possible that expanding the approach to whole-genome sequencing can lead to gains in prediction accuracy. SIGNIFICANCE: There is evidence that cancer is partly caused by rare genetic variants that have not yet been identified. We investigate this issue using novel statistical methods and data from the UK Biobank. |
format | Online Article Text |
id | pubmed-10032232 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Association for Cancer Research |
record_format | MEDLINE/PubMed |
spelling | pubmed-10032232 2023-03-23 Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach Guan, Zoe Begg, Colin B. Shen, Ronglai Cancer Res Commun Research Article Many studies have shown that the distributions of the genomic, nucleotide, and epigenetic contexts of somatic variants in tumors are informative of cancer etiology. Recently, a new direction of research has focused on extracting signals from the contexts of germline variants and evidence has emerged that patterns defined by these factors are associated with oncogenic pathways, histologic subtypes, and prognosis. It remains an open question whether aggregating germline variants using meta-features capturing their genomic, nucleotide, and epigenetic contexts can improve cancer risk prediction. This aggregation approach can potentially increase statistical power for detecting signals from rare variants, which have been hypothesized to be a major source of the missing heritability of cancer. Using germline whole-exome sequencing data from the UK Biobank, we developed risk models for 10 cancer types using known risk variants (cancer-associated SNPs and pathogenic variants in known cancer predisposition genes) as well as models that additionally include the meta-features. The meta-features did not improve the prediction accuracy of models based on known risk variants. It is possible that expanding the approach to whole-genome sequencing can lead to gains in prediction accuracy. SIGNIFICANCE: There is evidence that cancer is partly caused by rare genetic variants that have not yet been identified. We investigate this issue using novel statistical methods and data from the UK Biobank.
American Association for Cancer Research 2023-03-22 /pmc/articles/PMC10032232/ /pubmed/36969913 http://dx.doi.org/10.1158/2767-9764.CRC-22-0355 Text en © 2023 The Authors; Published by the American Association for Cancer Research https://creativecommons.org/licenses/by/4.0/ This open access article is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. |
spellingShingle | Research Article Guan, Zoe Begg, Colin B. Shen, Ronglai Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title | Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title_full | Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title_fullStr | Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title_full_unstemmed | Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title_short | Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach |
title_sort | predicting cancer risk from germline whole-exome sequencing data using a novel context-based variant aggregation approach |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10032232/ https://www.ncbi.nlm.nih.gov/pubmed/36969913 http://dx.doi.org/10.1158/2767-9764.CRC-22-0355 |
work_keys_str_mv | AT guanzoe predictingcancerriskfromgermlinewholeexomesequencingdatausinganovelcontextbasedvariantaggregationapproach AT beggcolinb predictingcancerriskfromgermlinewholeexomesequencingdatausinganovelcontextbasedvariantaggregationapproach AT shenronglai predictingcancerriskfromgermlinewholeexomesequencingdatausinganovelcontextbasedvariantaggregationapproach |