Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are sep...

Bibliographic Details
Main authors: Kronenberg, Zev N., Rhie, Arang, Koren, Sergey, Concepcion, Gregory T., Peluso, Paul, Munson, Katherine M., Porubsky, David, Kuhn, Kristen, Mueller, Kathryn A., Low, Wai Yee, Hiendleder, Stefan, Fedrigo, Olivier, Liachko, Ivan, Hall, Richard J., Phillippy, Adam M., Eichler, Evan E., Williams, John L., Smith, Timothy P. L., Jarvis, Erich D., Sullivan, Shawn T., Kingan, Sarah B.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081726/
https://www.ncbi.nlm.nih.gov/pubmed/33911078
http://dx.doi.org/10.1038/s41467-020-20536-y
author Kronenberg, Zev N.
Rhie, Arang
Koren, Sergey
Concepcion, Gregory T.
Peluso, Paul
Munson, Katherine M.
Porubsky, David
Kuhn, Kristen
Mueller, Kathryn A.
Low, Wai Yee
Hiendleder, Stefan
Fedrigo, Olivier
Liachko, Ivan
Hall, Richard J.
Phillippy, Adam M.
Eichler, Evan E.
Williams, John L.
Smith, Timothy P. L.
Jarvis, Erich D.
Sullivan, Shawn T.
Kingan, Sarah B.
author_sort Kronenberg, Zev N.
collection PubMed
description Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.
format Online
Article
Text
id pubmed-8081726
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8081726 2021-05-11 Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C Kronenberg, Zev N. Rhie, Arang Koren, Sergey Concepcion, Gregory T. Peluso, Paul Munson, Katherine M. Porubsky, David Kuhn, Kristen Mueller, Kathryn A. Low, Wai Yee Hiendleder, Stefan Fedrigo, Olivier Liachko, Ivan Hall, Richard J. Phillippy, Adam M. Eichler, Evan E. Williams, John L. Smith, Timothy P. L. Jarvis, Erich D. Sullivan, Shawn T. Kingan, Sarah B. Nat Commun Article Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.
Nature Publishing Group UK 2021-04-28 /pmc/articles/PMC8081726/ /pubmed/33911078 http://dx.doi.org/10.1038/s41467-020-20536-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C
topic Article