A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data

Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations in...

Bibliographic Details
Main authors: Hajirasouliha, Iman, Mahmoody, Ahmad, Raphael, Benjamin J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058927/
https://www.ncbi.nlm.nih.gov/pubmed/24932008
http://dx.doi.org/10.1093/bioinformatics/btu284
_version_ 1782321187808870400
author Hajirasouliha, Iman
Mahmoody, Ahmad
Raphael, Benjamin J.
author_facet Hajirasouliha, Iman
Mahmoody, Ahmad
Raphael, Benjamin J.
author_sort Hajirasouliha, Iman
collection PubMed
description Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells. Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data. Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/ Contact: braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4058927
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-40589272014-06-18 A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data Hajirasouliha, Iman Mahmoody, Ahmad Raphael, Benjamin J. Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: High-throughput sequencing of tumor samples has shown that most tumors exhibit extensive intra-tumor heterogeneity, with multiple subpopulations of tumor cells containing different somatic mutations. Recent studies have quantified this intra-tumor heterogeneity by clustering mutations into subpopulations according to the observed counts of DNA sequencing reads containing the variant allele. However, these clustering approaches do not consider that the population frequencies of different tumor subpopulations are correlated by their shared ancestry in the same population of cells. Results: We introduce the binary tree partition (BTP), a novel combinatorial formulation of the problem of constructing the subpopulations of tumor cells from the variant allele frequencies of somatic mutations. We show that finding a BTP is an NP-complete problem; derive an approximation algorithm for an optimization version of the problem; and present a recursive algorithm to find a BTP with errors in the input. We show that the resulting algorithm outperforms existing clustering approaches on simulated and real sequencing data. Availability and implementation: Python and MATLAB implementations of our method are available at http://compbio.cs.brown.edu/software/ Contact: braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058927/ /pubmed/24932008 http://dx.doi.org/10.1093/bioinformatics/btu284 Text en © The Author 2014. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Ismb 2014 Proceedings Papers Committee
Hajirasouliha, Iman
Mahmoody, Ahmad
Raphael, Benjamin J.
A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data
title A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data
topic Ismb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058927/
https://www.ncbi.nlm.nih.gov/pubmed/24932008
http://dx.doi.org/10.1093/bioinformatics/btu284