Genome SEGE: A database for 'intronless' genes in eukaryotic genomes

BACKGROUND: Data for a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary stud...

Bibliographic Details
Main authors: Sakharkar, Meena Kishore; Kangueane, Pandjassarame
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC434494/
https://www.ncbi.nlm.nih.gov/pubmed/15175116
http://dx.doi.org/10.1186/1471-2105-5-67
_version_ 1782121516989677568
author Sakharkar, Meena Kishore
Kangueane, Pandjassarame
author_facet Sakharkar, Meena Kishore
Kangueane, Pandjassarame
author_sort Sakharkar, Meena Kishore
collection PubMed
description BACKGROUND: Data for a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is available. However, SEGE is derived using GenBank. The redundant, incomplete and heterogeneous qualities of GenBank data are a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed. Availability: DESCRIPTION: Eukaryotic 'intronless' genes are extracted from nine completely sequenced genomes (four of which are unicellular and five of which are multi-cellular). The complete dataset is available for download. Data subsets are also available for 'intronless' pseudo-genes. The database provides information on the distribution of 'intronless' genes in different genomes together with their length distributions in each genome. Additionally, the search tool provides pre-computed PROSITE motifs for each sequence in the database with appropriate hyperlinks to InterPro. A search facility is also available through the web server. CONCLUSIONS: The unique feature that distinguishes Genome SEGE from SEGE is the provision of representative 'intronless' datasets for completely sequenced genomes. 'Intronless' gene sets available in this database will be of use for subsequent bio-computational analysis in comparative genomics and evolutionary studies. Such analysis may help to revisit the original genome data for re-examination and re-annotation.
format Text
id pubmed-434494
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-434494 2004-06-25 Genome SEGE: A database for 'intronless' genes in eukaryotic genomes Sakharkar, Meena Kishore Kangueane, Pandjassarame BMC Bioinformatics Database BACKGROUND: Data for a number of completely sequenced eukaryotic genomes are available in the public domain. Eukaryotic genes are either 'intron containing' or 'intronless'. Eukaryotic 'intronless' genes are interesting datasets for comparative genomics and evolutionary studies. The SEGE database, a collection of eukaryotic single exon genes, is available. However, SEGE is derived using GenBank. The redundant, incomplete and heterogeneous qualities of GenBank data are a bottleneck for biological investigation in comparative genomics and evolutionary studies. Such studies often require representative gene sets from each genome, and this is possible only by deriving specific datasets from completely sequenced genome data. Thus Genome SEGE, a database for 'intronless' genes in completely sequenced eukaryotic genomes, has been constructed. Availability: DESCRIPTION: Eukaryotic 'intronless' genes are extracted from nine completely sequenced genomes (four of which are unicellular and five of which are multi-cellular). The complete dataset is available for download. Data subsets are also available for 'intronless' pseudo-genes. The database provides information on the distribution of 'intronless' genes in different genomes together with their length distributions in each genome. Additionally, the search tool provides pre-computed PROSITE motifs for each sequence in the database with appropriate hyperlinks to InterPro. A search facility is also available through the web server. CONCLUSIONS: The unique feature that distinguishes Genome SEGE from SEGE is the provision of representative 'intronless' datasets for completely sequenced genomes. 'Intronless' gene sets available in this database will be of use for subsequent bio-computational analysis in comparative genomics and evolutionary studies. Such analysis may help to revisit the original genome data for re-examination and re-annotation. BioMed Central 2004-06-02 /pmc/articles/PMC434494/ /pubmed/15175116 http://dx.doi.org/10.1186/1471-2105-5-67 Text en Copyright © 2004 Sakharkar and Kangueane; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Database
Sakharkar, Meena Kishore
Kangueane, Pandjassarame
Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title_full Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title_fullStr Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title_full_unstemmed Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title_short Genome SEGE: A database for 'intronless' genes in eukaryotic genomes
title_sort genome sege: a database for 'intronless' genes in eukaryotic genomes
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC434494/
https://www.ncbi.nlm.nih.gov/pubmed/15175116
http://dx.doi.org/10.1186/1471-2105-5-67
work_keys_str_mv AT sakharkarmeenakishore genomesegeadatabaseforintronlessgenesineukaryoticgenomes
AT kangueanepandjassarame genomesegeadatabaseforintronlessgenesineukaryoticgenomes