Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes
BACKGROUND: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Intei...
Main Authors: | Goodwin, Timothy JD; Butler, Margaret I; Poulter, Russell TM |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635734/ https://www.ncbi.nlm.nih.gov/pubmed/17069655 http://dx.doi.org/10.1186/1741-7007-4-38 |
_version_ | 1782130713113395200 |
---|---|
author | Goodwin, Timothy JD Butler, Margaret I Poulter, Russell TM |
author_facet | Goodwin, Timothy JD Butler, Margaret I Poulter, Russell TM |
author_sort | Goodwin, Timothy JD |
collection | PubMed |
description | BACKGROUND: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. RESULTS: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. 
CONCLUSION: The identification of these new inteins increases the known host range of intein sequences in eukaryotes, and provides fresh insights into their origins and evolution. We conclude that inteins are ancient eukaryote elements once found widely among microbial eukaryotes. They persist as rarities in the genomes of a sporadic array of microorganisms, occupying highly conserved sites in diverse proteins. |
format | Text |
id | pubmed-1635734 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1635734 2006-11-11 Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes Goodwin, Timothy JD Butler, Margaret I Poulter, Russell TM BMC Biol Research Article BACKGROUND: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi. RESULTS: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. 
This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins. CONCLUSION: The identification of these new inteins increases the known host range of intein sequences in eukaryotes, and provides fresh insights into their origins and evolution. We conclude that inteins are ancient eukaryote elements once found widely among microbial eukaryotes. They persist as rarities in the genomes of a sporadic array of microorganisms, occupying highly conserved sites in diverse proteins. BioMed Central 2006-10-27 /pmc/articles/PMC1635734/ /pubmed/17069655 http://dx.doi.org/10.1186/1741-7007-4-38 Text en Copyright © 2006 Goodwin et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Goodwin, Timothy JD Butler, Margaret I Poulter, Russell TM Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title | Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title_full | Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title_fullStr | Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title_full_unstemmed | Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title_short | Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes |
title_sort | multiple, non-allelic, intein-coding sequences in eukaryotic rna polymerase genes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635734/ https://www.ncbi.nlm.nih.gov/pubmed/17069655 http://dx.doi.org/10.1186/1741-7007-4-38 |
work_keys_str_mv | AT goodwintimothyjd multiplenonallelicinteincodingsequencesineukaryoticrnapolymerasegenes AT butlermargareti multiplenonallelicinteincodingsequencesineukaryoticrnapolymerasegenes AT poulterrusselltm multiplenonallelicinteincodingsequencesineukaryoticrnapolymerasegenes |