BnpC: Bayesian non-parametric clustering of single-cell mutation profiles

Bibliographic Details
Main authors: Borgsmüller, Nico; Bonet, Jose; Marass, Francesco; Gonzalez-Perez, Abel; Lopez-Bigas, Nuria; Beerenwinkel, Niko
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7750970/
https://www.ncbi.nlm.nih.gov/pubmed/32592465
http://dx.doi.org/10.1093/bioinformatics/btaa599
collection PubMed
description MOTIVATION: The high resolution of single-cell DNA sequencing (scDNA-seq) offers great potential to resolve intratumor heterogeneity (ITH) by distinguishing clonal populations based on their mutation profiles. However, the increasing size of scDNA-seq datasets and technical limitations, such as high error rates and a large proportion of missing values, complicate this task and limit the applicability of existing methods.
RESULTS: Here, we introduce BnpC, a novel non-parametric method to cluster individual cells into clones and infer their genotypes based on their noisy mutation profiles. We benchmarked our method comprehensively against state-of-the-art methods on simulated data using various data sizes, and applied it to three cancer scDNA-seq datasets. On simulated data, BnpC compared favorably against current methods in terms of accuracy, runtime and scalability. Its inferred genotypes were the most accurate, especially on highly heterogeneous data, and it was the only method able to run and produce results on datasets with 5000 cells. On tumor scDNA-seq data, BnpC was able to identify clonal populations missed by the original cluster analysis but supported by supplementary experimental data. With ever-growing scDNA-seq datasets, scalable and accurate methods such as BnpC will become increasingly relevant, not only to resolve ITH but also as a preprocessing step to reduce data size.
AVAILABILITY AND IMPLEMENTATION: BnpC is freely available under the MIT license at https://github.com/cbg-ethz/BnpC.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7750970
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7750970 2020-12-28 BnpC: Bayesian non-parametric clustering of single-cell mutation profiles. Borgsmüller, Nico; Bonet, Jose; Marass, Francesco; Gonzalez-Perez, Abel; Lopez-Bigas, Nuria; Beerenwinkel, Niko. Bioinformatics, Original Papers.
Oxford University Press 2020-06-27 /pmc/articles/PMC7750970/ /pubmed/32592465 http://dx.doi.org/10.1093/bioinformatics/btaa599 Text en
© The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title BnpC: Bayesian non-parametric clustering of single-cell mutation profiles
topic Original Papers