Resolving the complex Bordetella pertussis genome using barcoded nanopore sequencing

The genome of Bordetella pertussis is complex, with high G+C content and many repeats, each longer than 1000 bp. Long-read sequencing offers the opportunity to produce single-contig B. pertussis assemblies using sequencing reads which are longer than the repetitive sections, with the potential to re...

Bibliographic Details
Main authors: Ring, Natalie; Abrahams, Jonathan S.; Jain, Miten; Olsen, Hugh; Preston, Andrew; Bagby, Stefan
Format: Online Article Text
Language: English
Published: Microbiology Society 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6321869/
https://www.ncbi.nlm.nih.gov/pubmed/30461375
http://dx.doi.org/10.1099/mgen.0.000234
_version_ 1783385528920440832
author Ring, Natalie
Abrahams, Jonathan S.
Jain, Miten
Olsen, Hugh
Preston, Andrew
Bagby, Stefan
author_sort Ring, Natalie
collection PubMed
description The genome of Bordetella pertussis is complex, with high G+C content and many repeats, each longer than 1000 bp. Long-read sequencing offers the opportunity to produce single-contig B. pertussis assemblies using sequencing reads which are longer than the repetitive sections, with the potential to reveal genomic features which were previously unobservable in multi-contig assemblies produced by short-read sequencing alone. We used an R9.4 MinION flow cell and barcoding to sequence five B. pertussis strains in a single sequencing run. We then trialled combinations of the many nanopore user community-built long-read analysis tools to establish the current optimal assembly pipeline for B. pertussis genome sequences. This pipeline produced closed genome sequences for four strains, allowing visualization of inter-strain genomic rearrangement. Read mapping to the Tohama I reference genome suggests that the remaining strain contains an ultra-long duplicated region (almost 200 kbp), which was not resolved by our pipeline; further investigation also revealed that a second strain that was seemingly resolved by our pipeline may contain an even longer duplication, albeit in a small subset of cells. We have therefore demonstrated the ability to resolve the structure of several B. pertussis strains per single barcoded nanopore flow cell, but the genomes with highest complexity (e.g. very large duplicated regions) remain only partially resolved using the standard library preparation and will require an alternative library preparation method. For full strain characterization, we recommend hybrid assembly of long and short reads together; for comparison of genome arrangement, assembly using long reads alone is sufficient.
format Online
Article
Text
id pubmed-6321869
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-6321869 2019-02-25 Microb Genom Research Article
Microbiology Society 2018-11-21 /pmc/articles/PMC6321869/ /pubmed/30461375 http://dx.doi.org/10.1099/mgen.0.000234 Text en © 2018 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Resolving the complex Bordetella pertussis genome using barcoded nanopore sequencing
topic Research Article