A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data
Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations in...
Main authors: | Hajirasouliha, Iman; Mahmoody, Ahmad; Raphael, Benjamin J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | Ismb 2014 Proceedings Papers Committee |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058927/ https://www.ncbi.nlm.nih.gov/pubmed/24932008 http://dx.doi.org/10.1093/bioinformatics/btu284 |
_version_ | 1782321187808870400 |
---|---|
author | Hajirasouliha, Iman Mahmoody, Ahmad Raphael, Benjamin J. |
author_facet | Hajirasouliha, Iman Mahmoody, Ahmad Raphael, Benjamin J. |
author_sort | Hajirasouliha, Iman |
collection | PubMed |
description | Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells. Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data. Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/ Contact: braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4058927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4058927 2014-06-18 A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data Hajirasouliha, Iman Mahmoody, Ahmad Raphael, Benjamin J. Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells. Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data. Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/ Contact: braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058927/ /pubmed/24932008 http://dx.doi.org/10.1093/bioinformatics/btu284 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb 2014 Proceedings Papers Committee Hajirasouliha, Iman Mahmoody, Ahmad Raphael, Benjamin J. A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title | A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title_full | A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title_fullStr | A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title_full_unstemmed | A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title_short | A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
title_sort | combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data |
topic | Ismb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058927/ https://www.ncbi.nlm.nih.gov/pubmed/24932008 http://dx.doi.org/10.1093/bioinformatics/btu284 |
work_keys_str_mv | AT hajirasoulihaiman acombinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata AT mahmoodyahmad acombinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata AT raphaelbenjaminj acombinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata AT hajirasoulihaiman combinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata AT mahmoodyahmad combinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata AT raphaelbenjaminj combinatorialapproachforanalyzingintratumorheterogeneityfromhighthroughputsequencingdata |