
High-throughput long paired-end sequencing of a Fosmid library by PacBio

BACKGROUND: Large insert paired-end sequencing technologies are important tools for assembling genomes, delineating associated breakpoints and detecting structural rearrangements. To facilitate the comprehensive detection of inter- and intra-chromosomal structural rearrangements or variants (SVs) an...


Bibliographic Details
Main Authors: Dai, Zhaozhao; Li, Tong; Li, Jiadong; Han, Zhifei; Pan, Yonglong; Tang, Sha; Diao, Xianmin; Luo, Meizhong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6878638/
https://www.ncbi.nlm.nih.gov/pubmed/31788019
http://dx.doi.org/10.1186/s13007-019-0525-6
author Dai, Zhaozhao
Li, Tong
Li, Jiadong
Han, Zhifei
Pan, Yonglong
Tang, Sha
Diao, Xianmin
Luo, Meizhong
collection PubMed
description BACKGROUND: Large insert paired-end sequencing technologies are important tools for assembling genomes, delineating associated breakpoints and detecting structural rearrangements. To facilitate the comprehensive detection of inter- and intra-chromosomal structural rearrangements or variants (SVs) and the assembly of complex genomes with long repeats and segmental duplications, we developed a new method based on single-molecule real-time synthesis sequencing technology for generating long paired-end sequences of large insert DNA libraries.
RESULTS: A Fosmid vector, pHZAUFOS3, was developed with the following new features: (1) two 18-bp non-palindromic I-SceI sites flank the cloning site, and another two sites are present in the vector backbone, so that I-SceI digestion releases the long DNA inserts (and the long paired-ends in this paper) as single fragments while cutting the vector (~8 kb) into 2–3 kb fragments, which can then be effectively removed from the long paired-ends (5–10 kb); (2) the chloramphenicol (Cm) resistance gene and the replicon (oriV), both necessary for colony growth, are located on either side of the cloning site, which helps to increase the ratio of paired-end fragments to single-end fragments in the paired-end libraries. Paired-end libraries were constructed by ligating the size-selected, mechanically sheared, pooled Fosmid DNA fragments to the ampicillin (Amp) resistance gene fragment and screening the colonies with Cm and Amp. We tested this method on yeast and Setaria italica Yugu1. Fosmid-size paired-ends with an average length longer than 2 kb for each end were generated. The N50 scaffold lengths of the de novo assemblies of the yeast and S. italica Yugu1 genomes were significantly improved. Five large and five small structural rearrangements or assembly errors spanning tens of bp to tens of kb were identified in S. italica Yugu1, including deletions, inversions, duplications and translocations.
CONCLUSIONS: We developed a new method for long paired-end sequencing of large insert libraries, which can efficiently improve the quality of de novo genome assembly and identify large and small structural rearrangements or assembly errors.
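The abstract reports the assembly improvement as a change in N50 scaffold length. As a reading aid, the following minimal sketch shows how N50 is conventionally computed: the length of the scaffold at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the scaffold lengths are hypothetical illustrations and are not data or code from the article.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not taken from the article.
before = [80, 70, 50, 40, 30, 20]
# Same total length, but paired-end links are assumed to have joined
# some scaffolds into a longer one.
after = [150, 70, 40, 30]

print(n50(before), n50(after))
```

In this toy case the total length is unchanged, yet N50 rises because joining scaffolds concentrates more of the assembly in fewer, longer pieces; this is the sense in which paired-end scaffolding can raise N50.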
format Online
Article
Text
id pubmed-6878638
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6878638 2019-11-29 High-throughput long paired-end sequencing of a Fosmid library by PacBio Dai, Zhaozhao Li, Tong Li, Jiadong Han, Zhifei Pan, Yonglong Tang, Sha Diao, Xianmin Luo, Meizhong Plant Methods Methodology BioMed Central 2019-11-26 /pmc/articles/PMC6878638/ /pubmed/31788019 http://dx.doi.org/10.1186/s13007-019-0525-6 Text en © The Author(s) 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title High-throughput long paired-end sequencing of a Fosmid library by PacBio
topic Methodology