
A data-driven approach for constructing mutation categories for mutational signature analysis

Mutational processes shape the genomes of cancer patients and their understanding has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base...

Bibliographic Details
Main Authors: Gilad, Gal, Leiserson, Mark D. M., Sharan, Roded
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8555780/
https://www.ncbi.nlm.nih.gov/pubmed/34665813
http://dx.doi.org/10.1371/journal.pcbi.1009542
_version_ 1784592045638156288
author Gilad, Gal
Leiserson, Mark D. M.
Sharan, Roded
collection PubMed
description Mutational processes shape the genomes of cancer patients, and understanding them has important applications in diagnosis and treatment. Current modeling of mutational processes, which identifies their characteristic signatures, views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells exposed to similar mutational processes show similar expression levels of the DNA damage repair genes involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases.
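The count of 96 standard categories mentioned in the description can be checked with a short sketch. It assumes the usual convention of writing each substitution with a pyrimidine (C or T) as the reference base, which gives six substitution classes, each combined with four possible bases on either side. The function name `mutation_categories` and the `flank` parameter are hypothetical and chosen for illustration only; this enumerates fixed-width contexts and is not the data-driven categorization the article proposes.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, assuming the pyrimidine-reference convention.
SUBSTITUTIONS = [("C", "A"), ("C", "G"), ("C", "T"),
                 ("T", "A"), ("T", "C"), ("T", "G")]


def mutation_categories(flank=1):
    """Enumerate substitution categories with `flank` bases on each side."""
    categories = []
    for ref, alt in SUBSTITUTIONS:
        for left in product(BASES, repeat=flank):
            for right in product(BASES, repeat=flank):
                categories.append(f"{''.join(left)}[{ref}>{alt}]{''.join(right)}")
    return categories


standard = mutation_categories(flank=1)  # one flanking base per side
wider = mutation_categories(flank=2)     # two flanking bases per side

print(len(standard))  # 6 * 4 * 4 = 96
print(len(wider))     # 6 * 16 * 16 = 1536
```

Widening the context by one base on each side multiplies the number of categories by 16, which is one reason a fixed wider context quickly becomes sparse for a given number of observed mutations.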
format Online
Article
Text
id pubmed-8555780
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8555780 2021-10-30 A data-driven approach for constructing mutation categories for mutational signature analysis Gilad, Gal Leiserson, Mark D. M. Sharan, Roded PLoS Comput Biol Research Article Mutational processes shape the genomes of cancer patients and their understanding has important applications in diagnosis and treatment. Current modeling of mutational processes by identifying their characteristic signatures views each base substitution in a limited context of a single flanking base on each side. This context definition gives rise to 96 categories of mutations that have become the standard in the field, even though wider contexts have been shown to be informative in specific cases. Here we propose a data-driven approach for constructing a mutation categorization for mutational signature analysis. Our approach is based on the assumption that tumor cells that are exposed to similar mutational processes, show similar expression levels of DNA damage repair genes that are involved in these processes. We attempt to find a categorization that maximizes the agreement between mutation and gene expression data, and show that it outperforms the standard categorization over multiple quality measures. Moreover, we show that the categorization we identify generalizes to unseen data from different cancer types, suggesting that mutation context patterns extend beyond the immediate flanking bases. Public Library of Science 2021-10-19 /pmc/articles/PMC8555780/ /pubmed/34665813 http://dx.doi.org/10.1371/journal.pcbi.1009542 Text en © 2021 Gilad et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A data-driven approach for constructing mutation categories for mutational signature analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8555780/
https://www.ncbi.nlm.nih.gov/pubmed/34665813
http://dx.doi.org/10.1371/journal.pcbi.1009542