Summarizing the solution space in tumor phylogeny inference by multiple consensus trees

Bibliographic Details
Main authors: Aguse, Nuraini; Qi, Yuanyuan; El-Kebir, Mohammed
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: ISMB/ECCB 2019 Conference Proceedings
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612807/
https://www.ncbi.nlm.nih.gov/pubmed/31510657
http://dx.doi.org/10.1093/bioinformatics/btz312
author Aguse, Nuraini
Qi, Yuanyuan
El-Kebir, Mohammed
collection PubMed
description MOTIVATION: Cancer phylogenies are key to studying tumorigenesis and have clinical implications. Due to the heterogeneous nature of cancer and limitations in current sequencing technology, current cancer phylogeny inference methods identify a large solution space of plausible phylogenies. To facilitate further downstream analyses, methods that accurately summarize such a set [Formula: see text] of cancer phylogenies are imperative. However, current summary methods are limited to a single consensus tree or graph and may miss important topological features that are present in different subsets of candidate trees. RESULTS: We introduce the Multiple Consensus Tree (MCT) problem to simultaneously cluster [Formula: see text] and infer a consensus tree for each cluster. We show that MCT is NP-hard, and present an exact algorithm based on mixed integer linear programming (MILP). In addition, we introduce a heuristic algorithm that efficiently identifies high-quality consensus trees, recovering all optimal solutions identified by the MILP in simulated data at a fraction of the time. We demonstrate the applicability of our methods on both simulated and real data, showing that our approach selects the number of clusters depending on the complexity of the solution space [Formula: see text]. AVAILABILITY AND IMPLEMENTATION: https://github.com/elkebir-group/MCT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6612807
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6612807 2019-07-12 Bioinformatics ISMB/ECCB 2019 Conference Proceedings Oxford University Press 2019-07 2019-07-05 /pmc/articles/PMC6612807/ /pubmed/31510657 http://dx.doi.org/10.1093/bioinformatics/btz312 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Summarizing the solution space in tumor phylogeny inference by multiple consensus trees
topic ISMB/ECCB 2019 Conference Proceedings