Cancer mutational signatures representation by large-scale context embedding
MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the compu...
Main authors: | Zhang, Yang; Xiao, Yunxuan; Yang, Muyu; Ma, Jian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | Macromolecular Sequence, Structure, and Function |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355300/ https://www.ncbi.nlm.nih.gov/pubmed/32657413 http://dx.doi.org/10.1093/bioinformatics/btaa433 |
_version_ | 1783558247887667200 |
---|---|
author | Zhang, Yang Xiao, Yunxuan Yang, Muyu Ma, Jian |
author_facet | Zhang, Yang Xiao, Yunxuan Yang, Muyu Ma, Jian |
author_sort | Zhang, Yang |
collection | PubMed |
description | MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns. RESULTS: Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations. AVAILABILITY AND IMPLEMENTATION: Source code of MutSpace can be accessed at: https://github.com/ma-compbio/MutSpace. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7355300 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-73553002020-07-16 Cancer mutational signatures representation by large-scale context embedding Zhang, Yang Xiao, Yunxuan Yang, Muyu Ma, Jian Bioinformatics Macromolecular Sequence, Structure, and Function MOTIVATION: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns. RESULTS: Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations. AVAILABILITY AND IMPLEMENTATION: Source code of MutSpace can be accessed at: https://github.com/ma-compbio/MutSpace. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355300/ /pubmed/32657413 http://dx.doi.org/10.1093/bioinformatics/btaa433 Text en © The Author(s) 2020. 
Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Macromolecular Sequence, Structure, and Function Zhang, Yang Xiao, Yunxuan Yang, Muyu Ma, Jian Cancer mutational signatures representation by large-scale context embedding |
title | Cancer mutational signatures representation by large-scale context embedding |
title_full | Cancer mutational signatures representation by large-scale context embedding |
title_fullStr | Cancer mutational signatures representation by large-scale context embedding |
title_full_unstemmed | Cancer mutational signatures representation by large-scale context embedding |
title_short | Cancer mutational signatures representation by large-scale context embedding |
title_sort | cancer mutational signatures representation by large-scale context embedding |
topic | Macromolecular Sequence, Structure, and Function |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355300/ https://www.ncbi.nlm.nih.gov/pubmed/32657413 http://dx.doi.org/10.1093/bioinformatics/btaa433 |
work_keys_str_mv | AT zhangyang cancermutationalsignaturesrepresentationbylargescalecontextembedding AT xiaoyunxuan cancermutationalsignaturesrepresentationbylargescalecontextembedding AT yangmuyu cancermutationalsignaturesrepresentationbylargescalecontextembedding AT majian cancermutationalsignaturesrepresentationbylargescalecontextembedding |