Mutational signature learning with supervised negative binomial non-negative matrix factorization

MOTIVATION: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the...

Bibliographic Details
Main Authors: Lyu, Xinrui; Garret, Jean; Rätsch, Gunnar; Lehmann, Kjong-Van
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Genomic Variation Analysis
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355241/
https://www.ncbi.nlm.nih.gov/pubmed/32657388
http://dx.doi.org/10.1093/bioinformatics/btaa473
author Lyu, Xinrui
Garret, Jean
Rätsch, Gunnar
Lehmann, Kjong-Van
collection PubMed
description MOTIVATION: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the derivation of mutational signatures can yield new discoveries previously obscured by technical and biological confounders. Results from existing mutational signature extraction methods depend on the size of the available patient cohort and focus solely on mutation count data without exploiting metadata.
RESULTS: Here we present a supervised method that utilizes cancer type as metadata to extract more distinctive signatures. More specifically, we use a negative binomial non-negative matrix factorization and add a support vector machine loss. We show that mutational signatures extracted by our proposed method have a lower reconstruction error and are designed to be more predictive of cancer type than those generated by unsupervised methods. Unlike existing unsupervised signature extraction methods, this design reduces the need for elaborate post-processing strategies to recover most of the known signatures. Signatures extracted by a supervised model used in conjunction with cancer-type labels are also more robust, especially when using small and potentially cancer-type limited patient cohorts. Finally, we adapted our model such that molecular features can be utilized to derive a corresponding mutational signature. We used APOBEC expression and MUTYH mutation status to demonstrate the possibilities that arise from this ability. We conclude that our method, which exploits available metadata, improves the quality of mutational signatures and helps derive more interpretable representations.
AVAILABILITY AND IMPLEMENTATION: https://github.com/ratschlab/SNBNMF-mutsig-public.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
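The abstract describes combining a negative binomial non-negative matrix factorization with a support vector machine loss. The following is a minimal sketch of what such a combined objective could look like, not the authors' implementation (the repository linked in the abstract is the authoritative source). Everything named here is an assumption made for illustration: the count matrix `V` (mutation types by samples), the factors `W` (signatures) and `H` (exposures), the linear classifier parameters `A` and `b`, the shared dispersion `r`, the trade-off weight `lam`, and the choice of a one-vs-rest hinge loss on the exposures.

```python
import math


def nb_nll(V, W, H, r):
    """Negative binomial negative log-likelihood of counts V with mean W @ H.

    Assumption: a single shared dispersion (size) parameter r, with
    success probability p = r / (r + mu) for each entry's mean mu.
    """
    total = 0.0
    for i, row in enumerate(V):
        for j, v in enumerate(row):
            # Reconstructed mean for entry (i, j); small offset avoids log(0).
            mu = sum(W[i][k] * H[k][j] for k in range(len(H))) + 1e-12
            log_pmf = (
                math.lgamma(v + r) - math.lgamma(r) - math.lgamma(v + 1)
                + r * math.log(r / (r + mu))
                + v * math.log(mu / (r + mu))
            )
            total -= log_pmf
    return total


def hinge_loss(H, labels, A, b):
    """One-vs-rest linear hinge loss on per-sample exposures (columns of H).

    Assumption: labels are integer class indices (e.g. cancer types),
    A has one weight row per class, and b has one bias per class.
    """
    total = 0.0
    for j, label in enumerate(labels):
        for c in range(len(A)):
            score = sum(A[c][k] * H[k][j] for k in range(len(H))) + b[c]
            y = 1.0 if c == label else -1.0
            total += max(0.0, 1.0 - y * score)
    return total


def supervised_objective(V, W, H, labels, A, b, r=10.0, lam=1.0):
    """Reconstruction term plus lam times the classification term."""
    return nb_nll(V, W, H, r) + lam * hinge_loss(H, labels, A, b)
```

In this sketch, minimizing `supervised_objective` over non-negative `W` and `H` (and over `A`, `b`) would trade off how well the signatures reconstruct the counts against how well the exposures separate the labels; the optimizer itself is deliberately left out.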
format Online
Article
Text
id pubmed-7355241
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355241 2020-07-16
Mutational signature learning with supervised negative binomial non-negative matrix factorization
Lyu, Xinrui; Garret, Jean; Rätsch, Gunnar; Lehmann, Kjong-Van
Bioinformatics, Genomic Variation Analysis
Oxford University Press 2020-07 2020-07-13
/pmc/articles/PMC7355241/ /pubmed/32657388 http://dx.doi.org/10.1093/bioinformatics/btaa473
Text en © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Mutational signature learning with supervised negative binomial non-negative matrix factorization
topic Genomic Variation Analysis