
Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes


Bibliographic Details
Main Authors: Goodwin, Timothy JD; Butler, Margaret I; Poulter, Russell TM
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635734/
https://www.ncbi.nlm.nih.gov/pubmed/17069655
http://dx.doi.org/10.1186/1741-7007-4-38
_version_ 1782130713113395200
author Goodwin, Timothy JD
Butler, Margaret I
Poulter, Russell TM
author_facet Goodwin, Timothy JD
Butler, Margaret I
Poulter, Russell TM
author_sort Goodwin, Timothy JD
collection PubMed
description BACKGROUND: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. RESULTS: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. 
CONCLUSION: The identification of these new inteins increases the known host range of intein sequences in eukaryotes, and provides fresh insights into their origins and evolution. We conclude that inteins are ancient eukaryote elements once found widely among microbial eukaryotes. They persist as rarities in the genomes of a sporadic array of microorganisms, occupying highly conserved sites in diverse proteins.
format Text
id pubmed-1635734
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-16357342006-11-11 Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes Goodwin, Timothy JD Butler, Margaret I Poulter, Russell TM BMC Biol Research Article BACKGROUND: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. RESULTS: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. 
This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. CONCLUSION: The identification of these new inteins increases the known host range of intein sequences in eukaryotes, and provides fresh insights into their origins and evolution. We conclude that inteins are ancient eukaryote elements once found widely among microbial eukaryotes. They persist as rarities in the genomes of a sporadic array of microorganisms, occupying highly conserved sites in diverse proteins. BioMed Central 2006-10-27 /pmc/articles/PMC1635734/ /pubmed/17069655 http://dx.doi.org/10.1186/1741-7007-4-38 Text en Copyright © 2006 Goodwin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Goodwin, Timothy JD
Butler, Margaret I
Poulter, Russell TM
Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title_full Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title_fullStr Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title_full_unstemmed Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title_short Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
title_sort multiple, non-allelic, intein-coding sequences in eukaryotic rna polymerase genes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635734/
https://www.ncbi.nlm.nih.gov/pubmed/17069655
http://dx.doi.org/10.1186/1741-7007-4-38