Cargando…

Learning mutational signatures and their multidimensional genomic properties with TensorSignatures

We present TensorSignatures, an algorithm to learn mutational signatures jointly across different variant categories and their genomic localisation and properties. The analysis of 2778 primary and 3824 metastatic cancer genomes of the PCAWG consortium and the HMF cohort shows that all signatures ope...

Descripción completa

Detalles Bibliográficos
Autores principales: Vöhringer, Harald, Hoeck, Arne Van, Cuppen, Edwin, Gerstung, Moritz
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206343/
https://www.ncbi.nlm.nih.gov/pubmed/34131135
http://dx.doi.org/10.1038/s41467-021-23551-9
Descripción
Sumario:We present TensorSignatures, an algorithm to learn mutational signatures jointly across different variant categories and their genomic localisation and properties. The analysis of 2778 primary and 3824 metastatic cancer genomes of the PCAWG consortium and the HMF cohort shows that all signatures operate dynamically in response to genomic states. The analysis pins differential spectra of UV mutagenesis found in active and inactive chromatin to global genome nucleotide excision repair. TensorSignatures accurately characterises transcription-associated mutagenesis in 7 different cancer types. The algorithm also extracts distinct signatures of replication- and double strand break repair-driven mutagenesis by APOBEC3A and 3B with differential numbers and length of mutation clusters. Finally, TensorSignatures reproduces a signature of somatic hypermutation generating highly clustered variants at transcription start sites of active genes in lymphoid leukaemia, distinct from a general and less clustered signature of Polη-driven translesion synthesis found in a broad range of cancer types. In summary, TensorSignatures elucidates complex mutational footprints by characterising their underlying processes with respect to a multitude of genomic variables.