Cancer mutational signatures representation by large-scale context embedding

Bibliographic Details
Main authors: Zhang, Yang; Xiao, Yunxuan; Yang, Muyu; Ma, Jian
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355300/
https://www.ncbi.nlm.nih.gov/pubmed/32657413
http://dx.doi.org/10.1093/bioinformatics/btaa433
Description
Summary: MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns. RESULTS: Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations. AVAILABILITY AND IMPLEMENTATION: Source code of MutSpace can be accessed at: https://github.com/ma-compbio/MutSpace. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.