
Paired-End Sequencing of Long-Range DNA Fragments for De Novo Assembly of Large, Complex Mammalian Genomes by Direct Intra-Molecule Ligation

BACKGROUND: The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammal genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 k...


Bibliographic Details
Main Authors: Asan; Geng, Chunyu; Chen, Yan; Wu, Kui; Cai, Qingle; Wang, Yu; Lang, Yongshan; Cao, Hongzhi; Yang, Huangming; Wang, Jian; Zhang, Xiuqing
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3459883/
https://www.ncbi.nlm.nih.gov/pubmed/23029438
http://dx.doi.org/10.1371/journal.pone.0046211
collection PubMed
description BACKGROUND: The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS. RESULTS: We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and even, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data. CONCLUSIONS: We demonstrate here the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.
format Online
Article
Text
id pubmed-3459883
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3459883 2012-10-01 PLoS One Research Article
Public Library of Science 2012-09-27 /pmc/articles/PMC3459883/ /pubmed/23029438 http://dx.doi.org/10.1371/journal.pone.0046211 Text en © 2012 Asan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article