Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes

BACKGROUND: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent ge...

Bibliographic Details
Main Authors: Zhou, Peng; Silverstein, Kevin A. T.; Ramaraj, Thiruvarangan; Guhlin, Joseph; Denny, Roxanne; Liu, Junqi; Farmer, Andrew D.; Steele, Kelly P.; Stupar, Robert M.; Miller, Jason R.; Tiffin, Peter; Mudge, Joann; Young, Nevin D.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5369179/
https://www.ncbi.nlm.nih.gov/pubmed/28347275
http://dx.doi.org/10.1186/s12864-017-3654-1
author Zhou, Peng
Silverstein, Kevin A. T.
Ramaraj, Thiruvarangan
Guhlin, Joseph
Denny, Roxanne
Liu, Junqi
Farmer, Andrew D.
Steele, Kelly P.
Stupar, Robert M.
Miller, Jason R.
Tiffin, Peter
Mudge, Joann
Young, Nevin D.
collection PubMed
description BACKGROUND: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome. RESULTS: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs, indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequence is missing in one or more accessions, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable – estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large-effect single nucleotide changes, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large-effect single nucleotide changes and even higher levels of copy number variation. CONCLUSIONS: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often under-estimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3654-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5369179
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5369179 2017-03-30 BMC Genomics Research Article BioMed Central 2017-03-27 /pmc/articles/PMC5369179/ /pubmed/28347275 http://dx.doi.org/10.1186/s12864-017-3654-1 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes
topic Research Article