Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications

Bibliographic Details
Main authors: Weissensteiner, Matthias H.; Pang, Andy W.C.; Bunikis, Ignas; Höijer, Ida; Vinnere-Petterson, Olga; Suh, Alexander; Wolf, Jochen B.W.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411765/
https://www.ncbi.nlm.nih.gov/pubmed/28360231
http://dx.doi.org/10.1101/gr.215095.116
_version_ 1783232861286957056
author Weissensteiner, Matthias H.
Pang, Andy W.C.
Bunikis, Ignas
Höijer, Ida
Vinnere-Petterson, Olga
Suh, Alexander
Wolf, Jochen B.W.
collection PubMed
description Accurate and contiguous genome assembly is key to a comprehensive understanding of the processes shaping genomic diversity and evolution. Yet, it is frequently constrained by constitutive heterochromatin, usually characterized by highly repetitive DNA. As a key feature of genome architecture associated with centromeric and subtelomeric regions, it locally influences meiotic recombination. In this study, we assess the impact of large tandem repeat arrays on the recombination rate landscape in an avian speciation model, the Eurasian crow. We assembled two high-quality genome references using single-molecule real-time sequencing (long-read assembly [LR]) and single-molecule optical maps (optical map assembly [OM]). A three-way comparison including the published short-read assembly (SR) constructed for the same individual allowed assessing assembly properties and pinpointing misassemblies. By combining information from all three assemblies, we characterized 36 previously unidentified large repetitive regions in the proximity of sequence assembly breakpoints, the majority of which contained complex arrays of a 14-kb satellite repeat or its 1.2-kb subunit. Using whole-genome population resequencing data, we estimated the population-scaled recombination rate (ρ) and found it to be significantly reduced in these regions. These findings are consistent with an effect of low recombination in regions adjacent to centromeric or subtelomeric heterochromatin and add to our understanding of the processes generating widespread heterogeneity in genetic diversity and differentiation along the genome. By combining three different technologies, our results highlight the importance of adding a layer of information on genome structure that is inaccessible to each approach independently.
format Online
Article
Text
id pubmed-5411765
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-5411765 2017-11-01 Genome Res Research Cold Spring Harbor Laboratory Press 2017-05 /pmc/articles/PMC5411765/ /pubmed/28360231 http://dx.doi.org/10.1101/gr.215095.116 Text en © 2017 Weissensteiner et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Combination of short-read, long-read, and optical mapping assemblies reveals large-scale tandem repeat arrays with population genetic implications
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411765/
https://www.ncbi.nlm.nih.gov/pubmed/28360231
http://dx.doi.org/10.1101/gr.215095.116