Dissecting a Hidden Gene Duplication: The Arabidopsis thaliana SEC10 Locus

Bibliographic Details
Main Authors: Vukašinović, Nemanja, Cvrčková, Fatima, Eliáš, Marek, Cole, Rex, Fowler, John E., Žárský, Viktor, Synek, Lukáš
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3984084/
https://www.ncbi.nlm.nih.gov/pubmed/24728280
http://dx.doi.org/10.1371/journal.pone.0094077
author Vukašinović, Nemanja
Cvrčková, Fatima
Eliáš, Marek
Cole, Rex
Fowler, John E.
Žárský, Viktor
Synek, Lukáš
collection PubMed
description Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
format Online
Article
Text
id pubmed-3984084
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3984084 2014-04-15 Dissecting a Hidden Gene Duplication: The Arabidopsis thaliana SEC10 Locus Vukašinović, Nemanja Cvrčková, Fatima Eliáš, Marek Cole, Rex Fowler, John E. Žárský, Viktor Synek, Lukáš PLoS One Research Article
Public Library of Science 2014-04-11 /pmc/articles/PMC3984084/ /pubmed/24728280 http://dx.doi.org/10.1371/journal.pone.0094077 Text en © 2014 Vukašinović et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Dissecting a Hidden Gene Duplication: The Arabidopsis thaliana SEC10 Locus
topic Research Article