The genome of the tegu lizard Salvator merianae: combining Illumina, PacBio, and optical mapping data to generate a highly contiguous assembly

BACKGROUND: Reptiles are a species-rich group with great phenotypic and life history diversity but are highly underrepresented among the vertebrate species with sequenced genomes. RESULTS: Here, we report a high-quality genome assembly of the tegu lizard, Salvator merianae, the first lacertoid with...

Bibliographic Details
Main authors: Roscito, Juliana G; Sameith, Katrin; Pippel, Martin; Francoijs, Kees-Jan; Winkler, Sylke; Dahl, Andreas; Papoutsoglou, Georg; Myers, Gene; Hiller, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Data Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6304105/
https://www.ncbi.nlm.nih.gov/pubmed/30481296
http://dx.doi.org/10.1093/gigascience/giy141
author Roscito, Juliana G
Sameith, Katrin
Pippel, Martin
Francoijs, Kees-Jan
Winkler, Sylke
Dahl, Andreas
Papoutsoglou, Georg
Myers, Gene
Hiller, Michael
collection PubMed
description BACKGROUND: Reptiles are a species-rich group with great phenotypic and life history diversity but are highly underrepresented among the vertebrate species with sequenced genomes. RESULTS: Here, we report a high-quality genome assembly of the tegu lizard, Salvator merianae, the first lacertoid with a sequenced genome. We combined 74X Illumina short-read, 29.8X Pacific Biosciences long-read, and optical mapping data to generate a high-quality assembly with a scaffold N50 value of 55.4 Mb. The contig N50 value of this assembly is 521 Kb, making it the most contiguous reptile assembly so far. We show that the tegu assembly has the highest completeness of coding genes and conserved non-exonic elements (CNEs) compared to other reptiles. Furthermore, the tegu assembly has the highest number of evolutionarily conserved CNE pairs, corroborating a high assembly contiguity in intergenic regions. As in other reptiles, long interspersed nuclear elements comprise the most abundant transposon class. We used transcriptomic data, homology- and de novo gene predictions to annotate 22,413 coding genes, of which 16,995 (76%) likely have human orthologs as inferred by CESAR-derived gene mappings. Finally, we generated a multiple genome alignment comprising 10 squamates and 7 other amniote species and identified conserved regions that are under evolutionary constraint. CNEs cover 38 Mb (1.8%) of the tegu genome, with 3.3 Mb in these elements being squamate specific. In contrast to placental mammal-specific CNEs, very few of these squamate-specific CNEs (<20 Kb) overlap transposons, highlighting a difference in how lineage-specific CNEs originated in these two clades. CONCLUSIONS: The tegu lizard genome together with the multiple genome alignment and comprehensive conserved element datasets provide a valuable resource for comparative genomic studies of reptiles and other amniotes.
format Online
Article
Text
id pubmed-6304105
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6304105 2018-12-27 Gigascience Data Note Oxford University Press 2018-11-27 /pmc/articles/PMC6304105/ /pubmed/30481296 http://dx.doi.org/10.1093/gigascience/giy141 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title The genome of the tegu lizard Salvator merianae: combining Illumina, PacBio, and optical mapping data to generate a highly contiguous assembly
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6304105/
https://www.ncbi.nlm.nih.gov/pubmed/30481296
http://dx.doi.org/10.1093/gigascience/giy141
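
The description field above reports contig and scaffold N50 values without defining the metric. As a reader aid, here is a minimal sketch of how N50 is commonly computed, assuming the usual definition (the length L such that sequences of length L or longer cover at least half of the total assembly). The function name `n50` and the toy lengths are hypothetical illustrations, not data or code from the article. The last lines re-derive the rounded ortholog share quoted in the abstract from its two gene counts.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running sum first reaches at
    least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy contig lengths (NOT taken from the tegu assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))

# Share of annotated coding genes with likely human orthologs,
# using the two counts given in the abstract (16,995 of 22,413).
ortholog_share = round(100 * 16995 / 22413)
print(ortholog_share)
```

Because N50 depends only on the length distribution, a single very long scaffold can raise it sharply, which is one reason the abstract reports contig and scaffold values separately.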