Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location
BACKGROUND: Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic...
Main authors: | Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5101643/ https://www.ncbi.nlm.nih.gov/pubmed/27825302 http://dx.doi.org/10.1186/s12864-016-3221-1 |
_version_ | 1782466318365097984 |
---|---|
author | Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek |
author_sort | Zmienko, Agnieszka |
collection | PubMed |
description | BACKGROUND: Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted. RESULTS: We analyzed the natural CNVs in the region overlapping MSH2 gene that encodes the DNA mismatch repair protein, and AT3G18530 and AT3G18535 genes that encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2–14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12–14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region, determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99%-identical sequences flanking these genes. The widespread geographical distribution of the deletions supported by the SNP and linkage disequilibrium analyses of the genomic sequence confirmed the recurrent nature of this CNV. CONCLUSIONS: We characterized in detail for the first time the complex multiallelic CNV in Arabidopsis genome. The region encoding MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. We provided the molecular insight into the mechanism underlying the recurrent nature of AT3G18530-AT3G18535 duplications/deletions. We also performed the first direct comparison of the two leading experimental methods, suitable for assessing the DNA copy number status. Our comprehensive case study provides foundation information for further analyses of CNV evolution in Arabidopsis and other plants, and their possible use in plant breeding. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3221-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5101643 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5101643 2016-11-10 BMC Genomics Research Article BioMed Central 2016-11-08 /pmc/articles/PMC5101643/ /pubmed/27825302 http://dx.doi.org/10.1186/s12864-016-3221-1 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location |
topic | Research Article |