Cancer mutational signatures representation by large-scale context embedding

MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the compu...

Bibliographic Details
Main authors: Zhang, Yang; Xiao, Yunxuan; Yang, Muyu; Ma, Jian
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355300/
https://www.ncbi.nlm.nih.gov/pubmed/32657413
http://dx.doi.org/10.1093/bioinformatics/btaa433
_version_ 1783558247887667200
author Zhang, Yang
Xiao, Yunxuan
Yang, Muyu
Ma, Jian
author_facet Zhang, Yang
Xiao, Yunxuan
Yang, Muyu
Ma, Jian
author_sort Zhang, Yang
collection PubMed
description MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns. RESULTS: Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations. AVAILABILITY AND IMPLEMENTATION: Source code of MutSpace can be accessed at: https://github.com/ma-compbio/MutSpace. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7355300
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355300 2020-07-16 Cancer mutational signatures representation by large-scale context embedding Zhang, Yang Xiao, Yunxuan Yang, Muyu Ma, Jian Bioinformatics Macromolecular Sequence, Structure, and Function Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355300/ /pubmed/32657413 http://dx.doi.org/10.1093/bioinformatics/btaa433 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Macromolecular Sequence, Structure, and Function
Zhang, Yang
Xiao, Yunxuan
Yang, Muyu
Ma, Jian
Cancer mutational signatures representation by large-scale context embedding
title Cancer mutational signatures representation by large-scale context embedding
topic Macromolecular Sequence, Structure, and Function
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355300/
https://www.ncbi.nlm.nih.gov/pubmed/32657413
http://dx.doi.org/10.1093/bioinformatics/btaa433