The Dark Matter of Large Cereal Genomes: Long Tandem Repeats

Reference genomes of important cereals, including barley, emmer wheat and bread wheat, were released recently. Their comparison with genome size estimates obtained by flow cytometry indicated that the assemblies represent not more than 88–98% of the complete genome. This work is aimed at identifying...

Bibliographic Details
Main authors: Kapustová, Veronika; Tulpová, Zuzana; Toegelová, Helena; Novák, Petr; Macas, Jiří; Karafiátová, Miroslava; Hřibová, Eva; Doležel, Jaroslav; Šimková, Hana
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6567227/
https://www.ncbi.nlm.nih.gov/pubmed/31137466
http://dx.doi.org/10.3390/ijms20102483
_version_ 1783427028190494720
author Kapustová, Veronika
Tulpová, Zuzana
Toegelová, Helena
Novák, Petr
Macas, Jiří
Karafiátová, Miroslava
Hřibová, Eva
Doležel, Jaroslav
Šimková, Hana
author_sort Kapustová, Veronika
collection PubMed
description Reference genomes of important cereals, including barley, emmer wheat and bread wheat, were released recently. Their comparison with genome size estimates obtained by flow cytometry indicated that the assemblies represent not more than 88–98% of the complete genome. This work is aimed at identifying the missing parts in two cereal genomes and proposing techniques to make the assemblies more complete. We focused on tandemly organised repetitive sequences, known to be underrepresented in genome assemblies generated from short-read sequence data. Our study found arrays of three tandem repeats with unit sizes of 1242 to 2726 bp present in the bread wheat reference genome generated from short reads. However, this and another wheat genome assembly employing long PacBio reads failed to integrate the 2726-bp repeat correctly in the pseudomolecule context. This suggests that tandem repeats of this size, frequently incorporated in unassigned scaffolds, may contribute to shrinking of pseudomolecules without reducing the size of the entire assembly. We demonstrate how this missing information may be added to the pseudomolecules with the aid of nanopore sequencing of individual BAC clones and optical mapping. Using the latter technique, we identified and localised a 470-kb-long array of 45S ribosomal DNA absent from the reference genome of barley.
format Online
Article
Text
id pubmed-6567227
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6567227 2019-06-17 Int J Mol Sci Article MDPI 2019-05-20 /pmc/articles/PMC6567227/ /pubmed/31137466 http://dx.doi.org/10.3390/ijms20102483 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title The Dark Matter of Large Cereal Genomes: Long Tandem Repeats
title_sort dark matter of large cereal genomes: long tandem repeats
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6567227/
https://www.ncbi.nlm.nih.gov/pubmed/31137466
http://dx.doi.org/10.3390/ijms20102483