
Clonal reconstruction from time course genomic sequencing data

BACKGROUND: Bacterial cells during many replication cycles accumulate spontaneous mutations, which result in the birth of novel clones. As a result of this clonal expansion, an evolving bacterial population has different clonal composition over time, as revealed in the long-term evolution experiment...


Bibliographic Details
Main authors: Ismail, Wazim Mohammed; Tang, Haixu
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936074/
https://www.ncbi.nlm.nih.gov/pubmed/31888455
http://dx.doi.org/10.1186/s12864-019-6328-3
_version_ 1783483678254432256
author Ismail, Wazim Mohammed
Tang, Haixu
author_facet Ismail, Wazim Mohammed
Tang, Haixu
author_sort Ismail, Wazim Mohammed
collection PubMed
description BACKGROUND: Bacterial cells during many replication cycles accumulate spontaneous mutations, which result in the birth of novel clones. As a result of this clonal expansion, an evolving bacterial population has different clonal composition over time, as revealed in the long-term evolution experiments (LTEEs). Accurately inferring the haplotypes of novel clones as well as the clonal frequencies and the clonal evolutionary history in a bacterial population is useful for the characterization of the evolutionary pressure on multiple correlated mutations instead of that on individual mutations. RESULTS: In this paper, we study the computational problem of reconstructing the haplotypes of bacterial clones from the variant allele frequencies observed from an evolving bacterial population at multiple time points. We formalize the problem using a maximum likelihood function, which is defined under the assumption that mutations occur spontaneously, and thus the likelihood of a mutation occurring in a specific clone is proportional to the frequency of the clone in the population when the mutation occurs. We develop a series of heuristic algorithms to address the maximum likelihood inference, and show through simulation experiments that the algorithms are fast and achieve near optimal accuracy that is practically plausible under the maximum likelihood framework. We also validate our method using experimental data obtained from a recent study on long-term evolution of Escherichia coli. CONCLUSION: We developed efficient algorithms to reconstruct the clonal evolution history from time course genomic sequencing data. Our algorithm can also incorporate clonal sequencing data to improve the reconstruction results when they are available. Based on the evaluation on both simulated and experimental sequencing data, our algorithms can achieve satisfactory results on the genome sequencing data from long-term evolution experiments. 
AVAILABILITY: The program (ClonalTREE) is available as open-source software on GitHub at https://github.com/COL-IU/ClonalTREE.
format Online
Article
Text
id pubmed-6936074
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6936074 2019-12-31 Clonal reconstruction from time course genomic sequencing data Ismail, Wazim Mohammed Tang, Haixu BMC Genomics Research BioMed Central 2019-12-30 /pmc/articles/PMC6936074/ /pubmed/31888455 http://dx.doi.org/10.1186/s12864-019-6328-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Ismail, Wazim Mohammed
Tang, Haixu
Clonal reconstruction from time course genomic sequencing data
title Clonal reconstruction from time course genomic sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936074/
https://www.ncbi.nlm.nih.gov/pubmed/31888455
http://dx.doi.org/10.1186/s12864-019-6328-3
work_keys_str_mv AT ismailwazimmohammed clonalreconstructionfromtimecoursegenomicsequencingdata
AT tanghaixu clonalreconstructionfromtimecoursegenomicsequencingdata