A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes

Bibliographic Details
Main Authors: Koonin, Eugene V; Fedorova, Natalie D; Jackson, John D; Jacobs, Aviva R; Krylov, Dmitri M; Makarova, Kira S; Mazumder, Raja; Mekhedov, Sergei L; Nikolskaya, Anastasia N; Rao, B Sridhar; Rogozin, Igor B; Smirnov, Sergei; Sorokin, Alexander V; Sverdlov, Alexander V; Vasudevan, Sona; Wolf, Yuri I; Yin, Jodie J; Natale, Darren A
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC395751/
https://www.ncbi.nlm.nih.gov/pubmed/14759257
_version_ 1782121326736048128
author Koonin, Eugene V
Fedorova, Natalie D
Jackson, John D
Jacobs, Aviva R
Krylov, Dmitri M
Makarova, Kira S
Mazumder, Raja
Mekhedov, Sergei L
Nikolskaya, Anastasia N
Rao, B Sridhar
Rogozin, Igor B
Smirnov, Sergei
Sorokin, Alexander V
Sverdlov, Alexander V
Vasudevan, Sona
Wolf, Yuri I
Yin, Jodie J
Natale, Darren A
author_sort Koonin, Eugene V
collection PubMed
description BACKGROUND: Sequencing the genomes of multiple, taxonomically diverse eukaryotes enables in-depth comparative-genomic analysis which is expected to help in reconstructing ancestral eukaryotic genomes and major events in eukaryotic evolution and in making functional predictions for currently uncharacterized conserved genes. RESULTS: We examined functional and evolutionary patterns in the recently constructed set of 5,873 clusters of predicted orthologs (eukaryotic orthologous groups or KOGs) from seven eukaryotic genomes: Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Encephalitozoon cuniculi. Conservation of KOGs through the phyletic range of eukaryotes strongly correlates with their functions and with the effect of gene knockout on the organism's viability. The approximately 40% of KOGs that are represented in six or seven species are enriched in proteins responsible for housekeeping functions, particularly translation and RNA processing. These conserved KOGs are often essential for survival and might approximate the minimal set of essential eukaryotic genes. The 131 single-member, pan-eukaryotic KOGs we identified were examined in detail. For around 20 that remained uncharacterized, functions were predicted by in-depth sequence analysis and examination of genomic context. Nearly all these proteins are subunits of known or predicted multiprotein complexes, in agreement with the balance hypothesis of evolution of gene copy number. Other KOGs show a variety of phyletic patterns, which points to major contributions of lineage-specific gene loss and the 'invention' of genes new to eukaryotic evolution. Examination of the sets of KOGs lost in individual lineages reveals co-elimination of functionally connected genes. Parsimonious scenarios of eukaryotic genome evolution and gene sets for ancestral eukaryotic forms were reconstructed. 
The gene set of the last common ancestor of the crown group consists of 3,413 KOGs and largely includes proteins involved in genome replication and expression, and central metabolism. Only 44% of the KOGs, mostly from the reconstructed gene set of the last common ancestor of the crown group, have detectable homologs in prokaryotes; the remainder apparently evolved via duplication with divergence and invention of new genes. CONCLUSIONS: The KOG analysis reveals a conserved core of largely essential eukaryotic genes as well as major diversification and innovation associated with evolution of eukaryotic genomes. The results provide quantitative support for major trends of eukaryotic evolution noticed previously at the qualitative level and a basis for detailed reconstruction of evolution of eukaryotic genomes and biology of ancestral forms.
format Text
id pubmed-395751
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-395751 2004-04-24 Genome Biol Research BioMed Central 2004 2004-01-15 /pmc/articles/PMC395751/ /pubmed/14759257 Text en Copyright © 2004 Koonin et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC395751/
https://www.ncbi.nlm.nih.gov/pubmed/14759257