
Long-read sequencing reveals a 4.4 kb tandem repeat region in the mitogenome of Echinococcus granulosus (sensu stricto) genotype G1

BACKGROUND: Echinococcus tapeworms cause a severe helminthic zoonosis called echinococcosis. The genus comprises various species and genotypes, of which E. granulosus (sensu stricto) represents a significant global public health and socioeconomic burden. Mitochondrial (mt) genomes have provided usef...


Bibliographic Details
Main authors: Kinkar, Liina, Korhonen, Pasi K., Cai, Huimin, Gauci, Charles G., Lightowlers, Marshall W., Saarma, Urmas, Jenkins, David J., Li, Jiandong, Li, Junhua, Young, Neil D., Gasser, Robin B.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6521400/
https://www.ncbi.nlm.nih.gov/pubmed/31097022
http://dx.doi.org/10.1186/s13071-019-3492-x
author Kinkar, Liina
Korhonen, Pasi K.
Cai, Huimin
Gauci, Charles G.
Lightowlers, Marshall W.
Saarma, Urmas
Jenkins, David J.
Li, Jiandong
Li, Junhua
Young, Neil D.
Gasser, Robin B.
collection PubMed
description BACKGROUND: Echinococcus tapeworms cause a severe helminthic zoonosis called echinococcosis. The genus comprises various species and genotypes, of which E. granulosus (sensu stricto) represents a significant global public health and socioeconomic burden. Mitochondrial (mt) genomes have provided useful genetic markers to explore the nature and extent of genetic diversity within Echinococcus and have underpinned phylogenetic and population structure analyses of this genus. Our recent work indicated a sequence gap (> 1 kb) in the mt genomes of E. granulosus genotype G1, which could not be determined by PCR-based Sanger sequencing. The aim of the present study was to define the complete mt genome, irrespective of structural complexities, using a long-read sequencing method.
METHODS: We extracted high molecular weight genomic DNA from protoscoleces from a single cyst of E. granulosus genotype G1 from a sheep from Australia using a conventional method and sequenced it using PacBio Sequel (long-read) technology, complemented by BGISEQ-500 short-read sequencing. Sequence data obtained were assembled using a recently-developed workflow.
RESULTS: We assembled a complete mt genome sequence of 17,675 bp, which is > 4 kb larger than the complete mt genomes known for E. granulosus genotype G1. This assembly includes a previously-elusive tandem repeat region, which is 4417 bp long and consists of ten near-identical 441–445 bp repeat units, each harbouring a 184 bp non-coding region and adjacent regions. We also identified a short non-coding region of 183 bp, which includes an inverted repeat.
CONCLUSIONS: We report what we consider to be the first complete mt genome of E. granulosus genotype G1 and characterise all repeat regions in this genome. The numbers, sizes, sequences and functions of tandem repeat regions remain to be studied in different isolates of genotype G1 and in other genotypes and species. The discovery of such ‘new’ repeat elements in the mt genome of genotype G1 by PacBio sequencing raises a question about the completeness of some published genomes of taeniid cestodes assembled from conventional or short-read sequence datasets. This study shows that long-read sequencing readily overcomes the challenges of assembling repeat elements to achieve improved genomes.
format Online
Article
Text
id pubmed-6521400
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6521400 2019-05-23
Parasit Vectors
Research
BioMed Central 2019-05-16
/pmc/articles/PMC6521400/ /pubmed/31097022 http://dx.doi.org/10.1186/s13071-019-3492-x
Text en
© The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Long-read sequencing reveals a 4.4 kb tandem repeat region in the mitogenome of Echinococcus granulosus (sensu stricto) genotype G1
topic Research