
Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach

Many studies have shown that the distributions of the genomic, nucleotide, and epigenetic contexts of somatic variants in tumors are informative of cancer etiology. Recently, a new direction of research has focused on extracting signals from the contexts of germline variants and evidence has emerged...

Bibliographic Details
Main authors: Guan, Zoe; Begg, Colin B.; Shen, Ronglai
Format: Online Article Text
Language: English
Published: American Association for Cancer Research 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10032232/
https://www.ncbi.nlm.nih.gov/pubmed/36969913
http://dx.doi.org/10.1158/2767-9764.CRC-22-0355
author Guan, Zoe
Begg, Colin B.
Shen, Ronglai
collection PubMed
description Many studies have shown that the distributions of the genomic, nucleotide, and epigenetic contexts of somatic variants in tumors are informative of cancer etiology. Recently, a new direction of research has focused on extracting signals from the contexts of germline variants and evidence has emerged that patterns defined by these factors are associated with oncogenic pathways, histologic subtypes, and prognosis. It remains an open question whether aggregating germline variants using meta-features capturing their genomic, nucleotide, and epigenetic contexts can improve cancer risk prediction. This aggregation approach can potentially increase statistical power for detecting signals from rare variants, which have been hypothesized to be a major source of the missing heritability of cancer. Using germline whole-exome sequencing data from the UK Biobank, we developed risk models for 10 cancer types using known risk variants (cancer-associated SNPs and pathogenic variants in known cancer predisposition genes) as well as models that additionally include the meta-features. The meta-features did not improve the prediction accuracy of models based on known risk variants. It is possible that expanding the approach to whole-genome sequencing can lead to gains in prediction accuracy. SIGNIFICANCE: There is evidence that cancer is partly caused by rare genetic variants that have not yet been identified. We investigate this issue using novel statistical methods and data from the UK Biobank.
format Online
Article
Text
id pubmed-10032232
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Association for Cancer Research
record_format MEDLINE/PubMed
spelling pubmed-10032232 2023-03-23 Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach Guan, Zoe; Begg, Colin B.; Shen, Ronglai Cancer Res Commun Research Article
American Association for Cancer Research 2023-03-22 /pmc/articles/PMC10032232/ /pubmed/36969913 http://dx.doi.org/10.1158/2767-9764.CRC-22-0355 Text en © 2023 The Authors; Published by the American Association for Cancer Research https://creativecommons.org/licenses/by/4.0/ This open access article is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
title Predicting Cancer Risk from Germline Whole-exome Sequencing Data Using a Novel Context-based Variant Aggregation Approach
topic Research Article