Reconstruction of Microbial Haplotypes by Integration of Statistical and Physical Linkage in Scaffolding

DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or “haplotypes.” However, current next-generation se...

Bibliographic Details
Main Authors: Cao, Chen; He, Jingni; Mak, Lauren; Perera, Deshan; Kwok, Devin; Wang, Jia; Li, Minghao; Mourier, Tobias; Gavriliuc, Stefan; Greenberg, Matthew; Morrissy, A Sorana; Sycuro, Laura K; Yang, Guang; Jeffares, Daniel C; Long, Quan
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Methods
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8136496/
https://www.ncbi.nlm.nih.gov/pubmed/33547786
http://dx.doi.org/10.1093/molbev/msab037
author Cao, Chen
He, Jingni
Mak, Lauren
Perera, Deshan
Kwok, Devin
Wang, Jia
Li, Minghao
Mourier, Tobias
Gavriliuc, Stefan
Greenberg, Matthew
Morrissy, A Sorana
Sycuro, Laura K
Yang, Guang
Jeffares, Daniel C
Long, Quan
collection PubMed
description DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or “haplotypes.” However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics, and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here, we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.
format Online
Article
Text
id pubmed-8136496
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8136496 2021-05-25 Mol Biol Evol Methods Oxford University Press 2021-02-06 /pmc/articles/PMC8136496/ /pubmed/33547786 http://dx.doi.org/10.1093/molbev/msab037 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Reconstruction of Microbial Haplotypes by Integration of Statistical and Physical Linkage in Scaffolding
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8136496/
https://www.ncbi.nlm.nih.gov/pubmed/33547786
http://dx.doi.org/10.1093/molbev/msab037