Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C
Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are sep...
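Main authors: | Kronenberg, Zev N.; Rhie, Arang; Koren, Sergey; Concepcion, Gregory T.; Peluso, Paul; Munson, Katherine M.; Porubsky, David; Kuhn, Kristen; Mueller, Kathryn A.; Low, Wai Yee; Hiendleder, Stefan; Fedrigo, Olivier; Liachko, Ivan; Hall, Richard J.; Phillippy, Adam M.; Eichler, Evan E.; Williams, John L.; Smith, Timothy P. L.; Jarvis, Erich D.; Sullivan, Shawn T.; Kingan, Sarah B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081726/ https://www.ncbi.nlm.nih.gov/pubmed/33911078 http://dx.doi.org/10.1038/s41467-020-20536-y |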
_version_ | 1783685704164835328 |
---|---|
author | Kronenberg, Zev N. Rhie, Arang Koren, Sergey Concepcion, Gregory T. Peluso, Paul Munson, Katherine M. Porubsky, David Kuhn, Kristen Mueller, Kathryn A. Low, Wai Yee Hiendleder, Stefan Fedrigo, Olivier Liachko, Ivan Hall, Richard J. Phillippy, Adam M. Eichler, Evan E. Williams, John L. Smith, Timothy P. L. Jarvis, Erich D. Sullivan, Shawn T. Kingan, Sarah B. |
author_facet | Kronenberg, Zev N. Rhie, Arang Koren, Sergey Concepcion, Gregory T. Peluso, Paul Munson, Katherine M. Porubsky, David Kuhn, Kristen Mueller, Kathryn A. Low, Wai Yee Hiendleder, Stefan Fedrigo, Olivier Liachko, Ivan Hall, Richard J. Phillippy, Adam M. Eichler, Evan E. Williams, John L. Smith, Timothy P. L. Jarvis, Erich D. Sullivan, Shawn T. Kingan, Sarah B. |
author_sort | Kronenberg, Zev N. |
collection | PubMed |
description | Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs. |
format | Online Article Text |
id | pubmed-8081726 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8081726 2021-05-11 Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C Kronenberg, Zev N. Rhie, Arang Koren, Sergey Concepcion, Gregory T. Peluso, Paul Munson, Katherine M. Porubsky, David Kuhn, Kristen Mueller, Kathryn A. Low, Wai Yee Hiendleder, Stefan Fedrigo, Olivier Liachko, Ivan Hall, Richard J. Phillippy, Adam M. Eichler, Evan E. Williams, John L. Smith, Timothy P. L. Jarvis, Erich D. Sullivan, Shawn T. Kingan, Sarah B. Nat Commun Article Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs. 
Nature Publishing Group UK 2021-04-28 /pmc/articles/PMC8081726/ /pubmed/33911078 http://dx.doi.org/10.1038/s41467-020-20536-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Kronenberg, Zev N. Rhie, Arang Koren, Sergey Concepcion, Gregory T. Peluso, Paul Munson, Katherine M. Porubsky, David Kuhn, Kristen Mueller, Kathryn A. Low, Wai Yee Hiendleder, Stefan Fedrigo, Olivier Liachko, Ivan Hall, Richard J. Phillippy, Adam M. Eichler, Evan E. Williams, John L. Smith, Timothy P. L. Jarvis, Erich D. Sullivan, Shawn T. Kingan, Sarah B. Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title | Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title_full | Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title_fullStr | Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title_full_unstemmed | Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title_short | Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C |
title_sort | extended haplotype-phasing of long-read de novo genome assemblies using hi-c |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081726/ https://www.ncbi.nlm.nih.gov/pubmed/33911078 http://dx.doi.org/10.1038/s41467-020-20536-y |
work_keys_str_mv | AT kronenbergzevn extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT rhiearang extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT korensergey extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT concepciongregoryt extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT pelusopaul extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT munsonkatherinem extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT porubskydavid extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT kuhnkristen extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT muellerkathryna extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT lowwaiyee extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT hiendlederstefan extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT fedrigoolivier extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT liachkoivan extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT hallrichardj extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT phillippyadamm extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT eichlerevane extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT williamsjohnl extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT smithtimothypl extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT jarviserichd extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT sullivanshawnt extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic AT kingansarahb extendedhaplotypephasingoflongreaddenovogenomeassembliesusinghic |