
Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location

BACKGROUND: Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic...


Bibliographic Details
Main Authors: Zmienko, Agnieszka; Samelak-Czajka, Anna; Kozlowski, Piotr; Szymanska, Maja; Figlerowicz, Marek
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5101643/
https://www.ncbi.nlm.nih.gov/pubmed/27825302
http://dx.doi.org/10.1186/s12864-016-3221-1
author Zmienko, Agnieszka
Samelak-Czajka, Anna
Kozlowski, Piotr
Szymanska, Maja
Figlerowicz, Marek
author_sort Zmienko, Agnieszka
collection PubMed
description BACKGROUND: Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana, at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted.
RESULTS: We analyzed the natural CNVs in the region overlapping the MSH2 gene, which encodes the DNA mismatch repair protein, and the AT3G18530 and AT3G18535 genes, which encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR, we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2–14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12–14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region and determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99%-identical sequences flanking these genes. The widespread geographical distribution of the deletions, supported by the SNP and linkage disequilibrium analyses of the genomic sequence, confirmed the recurrent nature of this CNV.
CONCLUSIONS: We characterized in detail, for the first time, a complex multiallelic CNV in the Arabidopsis genome. The region encoding the MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. We provided molecular insight into the mechanism underlying the recurrent nature of AT3G18530-AT3G18535 duplications/deletions. We also performed the first direct comparison of the two leading experimental methods suitable for assessing DNA copy number status. Our comprehensive case study provides foundational information for further analyses of CNV evolution in Arabidopsis and other plants, and their possible use in plant breeding.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3221-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5101643
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5101643 2016-11-10 BMC Genomics Research Article BioMed Central 2016-11-08 /pmc/articles/PMC5101643/ /pubmed/27825302 http://dx.doi.org/10.1186/s12864-016-3221-1 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location
topic Research Article