
Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing

BACKGROUND: Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. RESULTS: Here, we systematically assessed the utility...


Bibliographic Details
Main authors: van Heesch, Sebastiaan; Kloosterman, Wigard P; Lansu, Nico; Ruzius, Frans-Paul; Levandowsky, Elizabeth; Lee, Clarence C; Zhou, Shiguo; Goldstein, Steve; Schwartz, David C; Harkins, Timothy T; Guryev, Victor; Cuppen, Edwin
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3648348/
https://www.ncbi.nlm.nih.gov/pubmed/23590730
http://dx.doi.org/10.1186/1471-2164-14-257
author van Heesch, Sebastiaan
Kloosterman, Wigard P
Lansu, Nico
Ruzius, Frans-Paul
Levandowsky, Elizabeth
Lee, Clarence C
Zhou, Shiguo
Goldstein, Steve
Schwartz, David C
Harkins, Timothy T
Guryev, Victor
Cuppen, Edwin
author_sort van Heesch, Sebastiaan
collection PubMed
description BACKGROUND: Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. RESULTS: Here, we systematically assessed the utility of paired-end and mate-pair (MP) next-generation sequencing libraries with insert sizes ranging from 170 bp to 25 kb, for genome coverage and for improving scaffolding of a mammalian genome (Rattus norvegicus). Despite a lower library complexity, large insert MP libraries (20 or 25 kb) provided very high physical genome coverage and were found to efficiently span repeat elements in the genome. Medium-sized (5, 8 or 15 kb) MP libraries were much more efficient for genome structure analysis than the more commonly used shorter insert paired-end and 3 kb MP libraries. Furthermore, the combination of medium- and large insert libraries resulted in a 3-fold increase in N50 in scaffolding processes. Finally, we show that our data can be used to evaluate and improve contig order and orientation in the current rat reference genome assembly. CONCLUSIONS: We conclude that applying combinations of mate-pair libraries with insert sizes that match the distributions of repetitive elements improves contig scaffolding and can contribute to the finishing of draft genomes.
format Online
Article
Text
id pubmed-3648348
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3648348 2013-05-09 Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing. BMC Genomics. Research Article. BioMed Central 2013-04-16 /pmc/articles/PMC3648348/ /pubmed/23590730 http://dx.doi.org/10.1186/1471-2164-14-257 Text en Copyright © 2013 van Heesch et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing
topic Research Article