Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels

BACKGROUND: Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene- and genome-duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as ...

Bibliographic Details
Main authors: Duarte, Jill M, Wall, P Kerr, Edger, Patrick P, Landherr, Lena L, Ma, Hong, Pires, J Chris, Leebens-Mack, Jim, dePamphilis, Claude W
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2848037/
https://www.ncbi.nlm.nih.gov/pubmed/20181251
http://dx.doi.org/10.1186/1471-2148-10-61
author Duarte, Jill M
Wall, P Kerr
Edger, Patrick P
Landherr, Lena L
Ma, Hong
Pires, J Chris
Leebens-Mack, Jim
dePamphilis, Claude W
author_sort Duarte, Jill M
collection PubMed
description BACKGROUND: Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene- and genome-duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 genes that are shared single copy genes in the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, presence in distant lineages, such as Selaginella and Physcomitrella, and phylogenetic analysis to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using extant coverage in EST databases for seed plants and de novo amplification via RT-PCR in the family Brassicaceae. RESULTS: There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ("APVO SSC genes"). The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown. Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as the result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically-informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes. CONCLUSIONS: Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.
format Text
id pubmed-2848037
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2848037 2010-04-01 BMC Evol Biol Research article BioMed Central 2010-02-24 /pmc/articles/PMC2848037/ /pubmed/20181251 http://dx.doi.org/10.1186/1471-2148-10-61 Text en Copyright ©2010 Duarte et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2848037/
https://www.ncbi.nlm.nih.gov/pubmed/20181251
http://dx.doi.org/10.1186/1471-2148-10-61