Mutational signature learning with supervised negative binomial non-negative matrix factorization
MOTIVATION: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the...
Main authors: | Lyu, Xinrui; Garret, Jean; Rätsch, Gunnar; Lehmann, Kjong-Van |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Genomic Variation Analysis |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355241/ https://www.ncbi.nlm.nih.gov/pubmed/32657388 http://dx.doi.org/10.1093/bioinformatics/btaa473 |
_version_ | 1783558234805633024 |
---|---|
author | Lyu, Xinrui; Garret, Jean; Rätsch, Gunnar; Lehmann, Kjong-Van |
author_facet | Lyu, Xinrui; Garret, Jean; Rätsch, Gunnar; Lehmann, Kjong-Van |
author_sort | Lyu, Xinrui |
collection | PubMed |
description | MOTIVATION: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the derivation of mutational signatures can yield new discoveries previously obscured by technical and biological confounders. Results from existing mutational signature extraction methods depend on the size of the available patient cohort and focus solely on the analysis of mutation count data without exploiting metadata. RESULTS: Here we present a supervised method that utilizes cancer type as metadata to extract more distinctive signatures. More specifically, we use a negative binomial non-negative matrix factorization and add a support vector machine loss. We show that mutational signatures extracted by our proposed method have a lower reconstruction error and are designed to be more predictive of cancer type than those generated by unsupervised methods. Unlike existing unsupervised signature extraction methods, this design reduces the need for elaborate post-processing strategies to recover most of the known signatures. Signatures extracted by a supervised model used in conjunction with cancer-type labels are also more robust, especially when using small and potentially cancer-type limited patient cohorts. Finally, we adapted our model such that molecular features can be utilized to derive a corresponding mutational signature. We used APOBEC expression and MUTYH mutation status to demonstrate the possibilities that arise from this ability. We conclude that our method, which exploits available metadata, improves the quality of mutational signatures and helps derive more interpretable representations. AVAILABILITY AND IMPLEMENTATION: https://github.com/ratschlab/SNBNMF-mutsig-public. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7355241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7355241 2020-07-16 Mutational signature learning with supervised negative binomial non-negative matrix factorization Lyu, Xinrui Garret, Jean Rätsch, Gunnar Lehmann, Kjong-Van Bioinformatics Genomic Variation Analysis Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355241/ /pubmed/32657388 http://dx.doi.org/10.1093/bioinformatics/btaa473 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Genomic Variation Analysis Lyu, Xinrui Garret, Jean Rätsch, Gunnar Lehmann, Kjong-Van Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title | Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title_full | Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title_fullStr | Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title_full_unstemmed | Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title_short | Mutational signature learning with supervised negative binomial non-negative matrix factorization |
title_sort | mutational signature learning with supervised negative binomial non-negative matrix factorization |
topic | Genomic Variation Analysis |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355241/ https://www.ncbi.nlm.nih.gov/pubmed/32657388 http://dx.doi.org/10.1093/bioinformatics/btaa473 |
work_keys_str_mv | AT lyuxinrui mutationalsignaturelearningwithsupervisednegativebinomialnonnegativematrixfactorization AT garretjean mutationalsignaturelearningwithsupervisednegativebinomialnonnegativematrixfactorization AT ratschgunnar mutationalsignaturelearningwithsupervisednegativebinomialnonnegativematrixfactorization AT lehmannkjongvan mutationalsignaturelearningwithsupervisednegativebinomialnonnegativematrixfactorization |
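
The abstract in this record describes the method as a negative binomial non-negative matrix factorization with an added support vector machine loss. The following is a minimal sketch of what such a combined objective could look like, not the authors' implementation (which is at the GitHub URL in the description). All names here (`W`, `H`, `A`, `b`, `r`, `lam`) are hypothetical, the dispersion `r` is assumed fixed, the hinge term uses a Crammer-Singer multiclass form as one possible choice, and no optimizer or non-negativity projection is shown.

```python
import math
import numpy as np

# Element-wise log-gamma built from the standard library (avoids a SciPy dependency).
_lgamma = np.vectorize(math.lgamma, otypes=[float])


def nb_neg_log_likelihood(X, W, H, r, eps=1e-10):
    """Negative binomial negative log-likelihood of counts X.

    X: (n_features, n_samples) mutation counts
    W: (n_features, n_signatures) non-negative signatures (assumed name)
    H: (n_signatures, n_samples) non-negative exposures (assumed name)
    r: dispersion parameter, assumed fixed and shared across entries
    """
    mu = W @ H + eps  # reconstructed mean counts
    log_p = (
        _lgamma(X + r) - _lgamma(r) - _lgamma(X + 1.0)
        + r * np.log(r / (r + mu))
        + X * np.log(mu / (r + mu))
    )
    return -log_p.sum()


def multiclass_hinge(H, y, A, b):
    """Mean multiclass hinge loss of a linear classifier on the exposures.

    A: (n_signatures, n_classes) weights, b: (n_classes,) biases,
    y: (n_samples,) integer class labels (for example, cancer type).
    """
    scores = H.T @ A + b                     # (n_samples, n_classes)
    idx = np.arange(scores.shape[0])
    true_score = scores[idx, y]
    others = scores.copy()
    others[idx, y] = -np.inf                 # exclude the true class
    margins = 1.0 + others.max(axis=1) - true_score
    return np.maximum(0.0, margins).mean()


def supervised_objective(X, y, W, H, A, b, r=10.0, lam=1.0):
    """Reconstruction term plus lam times the classification term."""
    return nb_neg_log_likelihood(X, W, H, r) + lam * multiclass_hinge(H, y, A, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.poisson(5.0, size=(96, 20)).astype(float)  # 96 mutation channels
    y = rng.integers(0, 3, size=20)                    # 3 hypothetical cancer types
    W, H = rng.random((96, 4)), rng.random((4, 20))
    A, b = rng.normal(size=(4, 3)), np.zeros(3)
    print(supervised_objective(X, y, W, H, A, b))
```

In this sketch the weight `lam` trades off reconstruction quality against how well the exposures separate the labels; setting `lam=0` recovers a purely unsupervised negative binomial factorization objective.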