
Chromosome-scale assembly comparison of the Korean Reference Genome KOREF from PromethION and PacBio with Hi-C mapping information

BACKGROUND: Long DNA reads produced by single-molecule and pore-based sequencers are more suitable for assembly and structural variation discovery than short-read DNA fragments. For de novo assembly, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) are the favorite options. Howeve...


Bibliographic Details
Main authors: Kim, Hui-Su, Jeon, Sungwon, Kim, Changjae, Kim, Yeon Kyung, Cho, Yun Sung, Kim, Jungeun, Blazyte, Asta, Manica, Andrea, Lee, Semin, Bhak, Jong
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6889754/
https://www.ncbi.nlm.nih.gov/pubmed/31794015
http://dx.doi.org/10.1093/gigascience/giz125
author Kim, Hui-Su
Jeon, Sungwon
Kim, Changjae
Kim, Yeon Kyung
Cho, Yun Sung
Kim, Jungeun
Blazyte, Asta
Manica, Andrea
Lee, Semin
Bhak, Jong
author_sort Kim, Hui-Su
collection PubMed
description BACKGROUND: Long DNA reads produced by single-molecule and pore-based sequencers are more suitable for assembly and structural variation discovery than short-read DNA fragments. For de novo assembly, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) are the favorite options. However, PacBio's SMRT sequencing is expensive for a full human genome assembly and costs more than $40,000 US for 30× coverage as of 2019. ONT PromethION sequencing, on the other hand, is 1/12 the price of PacBio for the same coverage. This study aimed to compare the cost-effectiveness of ONT PromethION and PacBio's SMRT sequencing in relation to the quality. FINDINGS: We performed whole-genome de novo assemblies and comparison to construct an improved version of KOREF, the Korean reference genome, using sequencing data produced by PromethION and PacBio. With PromethION, an assembly using sequenced reads with 64× coverage (193 Gb, 3 flowcell sequencing) resulted in 3,725 contigs with N50s of 16.7 Mb and a total genome length of 2.8 Gb. It was comparable to a KOREF assembly constructed using PacBio at 62× coverage (188 Gb, 2,695 contigs, and N50s of 17.9 Mb). When we applied Hi-C–derived long-range mapping data, an even higher quality assembly for the 64× coverage was achieved, resulting in 3,179 scaffolds with an N50 of 56.4 Mb. CONCLUSION: The pore-based PromethION approach provided a high-quality chromosome-scale human genome assembly at a low cost with long maximum contig and scaffold lengths and was more cost-effective than PacBio at comparable quality measurements.
format Online
Article
Text
id pubmed-6889754
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6889754 2019-12-05 Chromosome-scale assembly comparison of the Korean Reference Genome KOREF from PromethION and PacBio with Hi-C mapping information Kim, Hui-Su Jeon, Sungwon Kim, Changjae Kim, Yeon Kyung Cho, Yun Sung Kim, Jungeun Blazyte, Asta Manica, Andrea Lee, Semin Bhak, Jong Gigascience Data Note BACKGROUND: Long DNA reads produced by single-molecule and pore-based sequencers are more suitable for assembly and structural variation discovery than short-read DNA fragments. For de novo assembly, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) are the favorite options. However, PacBio's SMRT sequencing is expensive for a full human genome assembly and costs more than $40,000 US for 30× coverage as of 2019. ONT PromethION sequencing, on the other hand, is 1/12 the price of PacBio for the same coverage. This study aimed to compare the cost-effectiveness of ONT PromethION and PacBio's SMRT sequencing in relation to the quality. FINDINGS: We performed whole-genome de novo assemblies and comparison to construct an improved version of KOREF, the Korean reference genome, using sequencing data produced by PromethION and PacBio. With PromethION, an assembly using sequenced reads with 64× coverage (193 Gb, 3 flowcell sequencing) resulted in 3,725 contigs with N50s of 16.7 Mb and a total genome length of 2.8 Gb. It was comparable to a KOREF assembly constructed using PacBio at 62× coverage (188 Gb, 2,695 contigs, and N50s of 17.9 Mb). When we applied Hi-C–derived long-range mapping data, an even higher quality assembly for the 64× coverage was achieved, resulting in 3,179 scaffolds with an N50 of 56.4 Mb. CONCLUSION: The pore-based PromethION approach provided a high-quality chromosome-scale human genome assembly at a low cost with long maximum contig and scaffold lengths and was more cost-effective than PacBio at comparable quality measurements.
Oxford University Press 2019-12-03 /pmc/articles/PMC6889754/ /pubmed/31794015 http://dx.doi.org/10.1093/gigascience/giz125 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Chromosome-scale assembly comparison of the Korean Reference Genome KOREF from PromethION and PacBio with Hi-C mapping information
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6889754/
https://www.ncbi.nlm.nih.gov/pubmed/31794015
http://dx.doi.org/10.1093/gigascience/giz125
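
Note on the figures in the description field: the abstract compares assemblies by contig N50, sequencing coverage, and price. As a rough illustration only, the Python sketch below shows how an N50 is usually computed from a list of contig lengths, and redoes the abstract's coverage and cost arithmetic. The function names, the toy contig lengths, and the use of $40,000 as a lower bound for the PacBio price are assumptions made for this example; none of it is taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def implied_genome_size_gb(sequenced_gb, coverage):
    """Genome size implied by total sequenced bases divided by coverage."""
    return sequenced_gb / coverage


# Toy contig lengths (hypothetical, not from the paper); they sum to 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)  # 70: the two longest contigs cover 150 of 300

# Coverage figures quoted in the abstract.
promethion_size_gb = implied_genome_size_gb(193, 64)  # about 3.02 Gb
pacbio_size_gb = implied_genome_size_gb(188, 62)      # about 3.03 Gb

# Cost figures quoted in the abstract: PacBio costs "more than $40,000"
# for 30x coverage, and PromethION is 1/12 of that price. Treating
# $40,000 as a lower bound is an assumption of this sketch.
pacbio_cost_30x_usd = 40_000
promethion_cost_30x_usd = pacbio_cost_30x_usd / 12    # about $3,333

if __name__ == "__main__":
    print(f"toy N50: {toy_n50}")
    print(f"implied genome size, PromethION: {promethion_size_gb:.2f} Gb")
    print(f"implied genome size, PacBio: {pacbio_size_gb:.2f} Gb")
    print(f"PromethION 30x cost, lower bound: ${promethion_cost_30x_usd:,.0f}")
```

Both coverage figures imply a genome of roughly 3 Gb, which is consistent with a human genome and slightly above the 2.8 Gb total assembly length reported for the PromethION assembly.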