A pan-cancer landscape of somatic mutations in non-unique regions of the human genome
A substantial fraction of the human genome displays high sequence similarity with at least one other genomic sequence, posing a challenge for the identification of somatic mutations from short-read sequencing data. We here annotate genomic variants in 2,658 cancers from the Pan-Cancer Analysis of Wh...
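Main authors: | Tarabichi, Maxime; Demeulemeester, Jonas; Verfaillie, Annelien; Flanagan, Adrienne M.; Van Loo, Peter; Konopka, Tomasz |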
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7612106/ https://www.ncbi.nlm.nih.gov/pubmed/34282324 http://dx.doi.org/10.1038/s41587-021-00971-y |
_version_ | 1783605334770712576 |
---|---|
author | Tarabichi, Maxime Demeulemeester, Jonas Verfaillie, Annelien Flanagan, Adrienne M. Van Loo, Peter Konopka, Tomasz |
author_sort | Tarabichi, Maxime |
collection | PubMed |
description | A substantial fraction of the human genome displays high sequence similarity with at least one other genomic sequence, posing a challenge for the identification of somatic mutations from short-read sequencing data. We here annotate genomic variants in 2,658 cancers from the Pan-Cancer Analysis of Whole Genomes (PCAWG) cohort with links to similar sites across the genome. We train a machine-learning model to use signals distributed over multiple genomic sites to call somatic events in non-unique regions and validate it against linked-read sequencing in an independent dataset. Using this approach, we uncover previously hidden mutations in ~1,700 coding sequences and in thousands of regulatory elements, including in known cancer genes, immunoglobulins, and highly mutated gene families. Mutations in non-unique regions are consistent with mutations in unique regions in terms of mutation burden and substitution profiles. The analysis provides a systematic summary of the mutation events in non-unique regions at a genome-wide scale across multiple human cancers. |
format | Online Article Text |
id | pubmed-7612106 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7612106 2022-01-19 A pan-cancer landscape of somatic mutations in non-unique regions of the human genome Tarabichi, Maxime Demeulemeester, Jonas Verfaillie, Annelien Flanagan, Adrienne M. Van Loo, Peter Konopka, Tomasz Nat Biotechnol Article A substantial fraction of the human genome displays high sequence similarity with at least one other genomic sequence, posing a challenge for the identification of somatic mutations from short-read sequencing data. We here annotate genomic variants in 2,658 cancers from the Pan-Cancer Analysis of Whole Genomes (PCAWG) cohort with links to similar sites across the genome. We train a machine-learning model to use signals distributed over multiple genomic sites to call somatic events in non-unique regions and validate it against linked-read sequencing in an independent dataset. Using this approach, we uncover previously hidden mutations in ~1,700 coding sequences and in thousands of regulatory elements, including in known cancer genes, immunoglobulins, and highly mutated gene families. Mutations in non-unique regions are consistent with mutations in unique regions in terms of mutation burden and substitution profiles. The analysis provides a systematic summary of the mutation events in non-unique regions at a genome-wide scale across multiple human cancers. 2021-12-01 2021-07-19 /pmc/articles/PMC7612106/ /pubmed/34282324 http://dx.doi.org/10.1038/s41587-021-00971-y Text en http://www.nature.com/authors/editorial_policies/license.html#terms Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms |
title | A pan-cancer landscape of somatic mutations in non-unique regions of the human genome |
topic | Article |