PhISCS: a combinatorial approach for subperfect tumor phylogeny reconstruction via integrative use of single-cell and bulk sequencing data

Bibliographic Details
Main Authors: Malikic, Salem; Mehrabadi, Farid Rashidi; Ciccolella, Simone; Rahman, Md. Khaledur; Ricketts, Camir; Haghshenas, Ehsan; Seidman, Daniel; Hach, Faraz; Hajirasouliha, Iman; Sahinalp, S. Cenk
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6836735/
https://www.ncbi.nlm.nih.gov/pubmed/31628256
http://dx.doi.org/10.1101/gr.234435.118
author Malikic, Salem
Mehrabadi, Farid Rashidi
Ciccolella, Simone
Rahman, Md. Khaledur
Ricketts, Camir
Haghshenas, Ehsan
Seidman, Daniel
Hach, Faraz
Hajirasouliha, Iman
Sahinalp, S. Cenk
collection PubMed
description Available computational methods for tumor phylogeny inference via single-cell sequencing (SCS) data typically aim to identify the most likely perfect phylogeny tree satisfying the infinite sites assumption (ISA). However, the limitations of SCS technologies including frequent allele dropout and variable sequence coverage may prohibit a perfect phylogeny. In addition, ISA violations are commonly observed in tumor phylogenies due to the loss of heterozygosity, deletions, and convergent evolution. In order to address such limitations, we introduce the optimal subperfect phylogeny problem which asks to integrate SCS data with matching bulk sequencing data by minimizing a linear combination of potential false negatives (due to allele dropout or variance in sequence coverage), false positives (due to read errors) among mutation calls, and the number of mutations that violate ISA (real or because of incorrect copy number estimation). We then describe a combinatorial formulation to solve this problem which ensures that several lineage constraints imposed by the use of variant allele frequencies (VAFs, derived from bulk sequence data) are satisfied. We express our formulation both in the form of an integer linear program (ILP) and—as a first in tumor phylogeny reconstruction—a Boolean constraint satisfaction problem (CSP) and solve them by leveraging state-of-the-art ILP/CSP solvers. The resulting method, which we name PhISCS, is the first to integrate SCS and bulk sequencing data while accounting for ISA violating mutations. In contrast to the alternative methods, typically based on probabilistic approaches, PhISCS provides a guarantee of optimality in reported solutions. Using simulated and real data sets, we demonstrate that PhISCS is more general and accurate than all available approaches.
format Online
Article
Text
id pubmed-6836735
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6836735 2019-11-20 Genome Res Method Cold Spring Harbor Laboratory Press 2019-11 /pmc/articles/PMC6836735/ /pubmed/31628256 http://dx.doi.org/10.1101/gr.234435.118 Text en © 2019 Malikic et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title PhISCS: a combinatorial approach for subperfect tumor phylogeny reconstruction via integrative use of single-cell and bulk sequencing data
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6836735/
https://www.ncbi.nlm.nih.gov/pubmed/31628256
http://dx.doi.org/10.1101/gr.234435.118