Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes
BACKGROUND: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent ge...
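Main authors: | Zhou, Peng; Silverstein, Kevin A. T.; Ramaraj, Thiruvarangan; Guhlin, Joseph; Denny, Roxanne; Liu, Junqi; Farmer, Andrew D.; Steele, Kelly P.; Stupar, Robert M.; Miller, Jason R.; Tiffin, Peter; Mudge, Joann; Young, Nevin D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5369179/ https://www.ncbi.nlm.nih.gov/pubmed/28347275 http://dx.doi.org/10.1186/s12864-017-3654-1 |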
_version_ | 1782518081371766784 |
---|---|
author | Zhou, Peng Silverstein, Kevin A. T. Ramaraj, Thiruvarangan Guhlin, Joseph Denny, Roxanne Liu, Junqi Farmer, Andrew D. Steele, Kelly P. Stupar, Robert M. Miller, Jason R. Tiffin, Peter Mudge, Joann Young, Nevin D. |
author_facet | Zhou, Peng Silverstein, Kevin A. T. Ramaraj, Thiruvarangan Guhlin, Joseph Denny, Roxanne Liu, Junqi Farmer, Andrew D. Steele, Kelly P. Stupar, Robert M. Miller, Jason R. Tiffin, Peter Mudge, Joann Young, Nevin D. |
author_sort | Zhou, Peng |
collection | PubMed |
description | BACKGROUND: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome. RESULTS: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequences is missing in one or more accession, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable – estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large effect single nucleotide change, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large effect single nucleotide changes and even higher levels of copy number variation. CONCLUSIONS: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often under-estimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3654-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5369179 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5369179 2017-03-30 Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes Zhou, Peng Silverstein, Kevin A. T. Ramaraj, Thiruvarangan Guhlin, Joseph Denny, Roxanne Liu, Junqi Farmer, Andrew D. Steele, Kelly P. Stupar, Robert M. Miller, Jason R. Tiffin, Peter Mudge, Joann Young, Nevin D. BMC Genomics Research Article BACKGROUND: Previous studies exploring sequence variation in the model legume, Medicago truncatula, relied on mapping short reads to a single reference. However, read-mapping approaches are inadequate to examine large, diverse gene families or to probe variation in repeat-rich or highly divergent genome regions. De novo sequencing and assembly of M. truncatula genomes enables near-comprehensive discovery of structural variants (SVs), analysis of rapidly evolving gene families, and ultimately, construction of a pan-genome. RESULTS: Genome-wide synteny based on 15 de novo M. truncatula assemblies effectively detected different types of SVs indicating that as much as 22% of the genome is involved in large structural changes, altogether affecting 28% of gene models. A total of 63 million base pairs (Mbp) of novel sequence was discovered, expanding the reference genome space for Medicago by 16%. Pan-genome analysis revealed that 42% (180 Mbp) of genomic sequences is missing in one or more accession, while examination of de novo annotated genes identified 67% (50,700) of all ortholog groups as dispensable – estimates comparable to recent studies in rice, maize and soybean. Rapidly evolving gene families typically associated with biotic interactions and stress response were found to be enriched in the accession-specific gene pool. The nucleotide-binding site leucine-rich repeat (NBS-LRR) family, in particular, harbors the highest level of nucleotide diversity, large effect single nucleotide change, protein diversity, and presence/absence variation. However, the leucine-rich repeat (LRR) and heat shock gene families are disproportionately affected by large effect single nucleotide changes and even higher levels of copy number variation. CONCLUSIONS: Analysis of multiple M. truncatula genomes illustrates the value of de novo assemblies to discover and describe structural variation, something that is often under-estimated when using read-mapping approaches. Comparisons among the de novo assemblies also indicate that different large gene families differ in the architecture of their structural variation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3654-1) contains supplementary material, which is available to authorized users. BioMed Central 2017-03-27 /pmc/articles/PMC5369179/ /pubmed/28347275 http://dx.doi.org/10.1186/s12864-017-3654-1 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Zhou, Peng Silverstein, Kevin A. T. Ramaraj, Thiruvarangan Guhlin, Joseph Denny, Roxanne Liu, Junqi Farmer, Andrew D. Steele, Kelly P. Stupar, Robert M. Miller, Jason R. Tiffin, Peter Mudge, Joann Young, Nevin D. Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title | Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title_full | Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title_fullStr | Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title_full_unstemmed | Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title_short | Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes |
title_sort | exploring structural variation and gene family architecture with de novo assemblies of 15 medicago genomes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5369179/ https://www.ncbi.nlm.nih.gov/pubmed/28347275 http://dx.doi.org/10.1186/s12864-017-3654-1 |
work_keys_str_mv | AT zhoupeng exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT silversteinkevinat exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT ramarajthiruvarangan exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT guhlinjoseph exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT dennyroxanne exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT liujunqi exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT farmerandrewd exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT steelekellyp exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT stuparrobertm exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT millerjasonr exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT tiffinpeter exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT mudgejoann exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes AT youngnevind exploringstructuralvariationandgenefamilyarchitecturewithdenovoassembliesof15medicagogenomes |