XenDB: Full length cDNA prediction and cross species mapping in Xenopus laevis

BACKGROUND: Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large scale sequencing efforts have provided an increasingly important resource for researchers. To provide full advantage of the available...

Bibliographic Details
Main Authors: Sczyrba, Alexander; Beckstette, Michael; Brivanlou, Ali H; Giegerich, Robert; Altmann, Curtis R
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261260/
https://www.ncbi.nlm.nih.gov/pubmed/16162280
http://dx.doi.org/10.1186/1471-2164-6-123
author Sczyrba, Alexander
Beckstette, Michael
Brivanlou, Ali H
Giegerich, Robert
Altmann, Curtis R
collection PubMed
description BACKGROUND: Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large scale sequencing efforts have provided an increasingly important resource for researchers. To provide full advantage of the available sequence, we have analyzed 350,468 Xenopus laevis Expressed Sequence Tags (ESTs) both to identify full length protein encoding sequences and to develop a unique database system to support comparative approaches between X. laevis and other model systems. DESCRIPTION: Using a suffix array based clustering approach, we have identified 25,971 clusters and 40,877 singleton sequences. Generation of a consensus sequence for each cluster resulted in 31,353 tentative contig and 4,801 singleton sequences. Using both BLASTX and FASTY comparison to five model organisms and the NR protein database, more than 15,000 sequences are predicted to encode full length proteins and these have been matched to publicly available IMAGE clones when available. Each sequence has been compared to the KOG database and ~67% of the sequences have been assigned a putative functional category. Based on sequence homology to mouse and human, putative GO annotations have been determined. CONCLUSION: The results of the analysis have been stored in a publicly available database, XenDB. A unique capability of the database is the ability to batch upload cross species queries to identify potential Xenopus homologues and their associated full length clones. Examples are provided including mapping of microarray results and application of 'in silico' analysis. The ability to quickly translate the results of various species into 'Xenopus-centric' information should greatly enhance comparative embryological approaches. Supplementary material can be found at .
format Text
id pubmed-1261260
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1261260 2005-10-22 XenDB: Full length cDNA prediction and cross species mapping in Xenopus laevis Sczyrba, Alexander Beckstette, Michael Brivanlou, Ali H Giegerich, Robert Altmann, Curtis R BMC Genomics Database BioMed Central 2005-09-14 /pmc/articles/PMC1261260/ /pubmed/16162280 http://dx.doi.org/10.1186/1471-2164-6-123 Text en Copyright © 2005 Sczyrba et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title XenDB: Full length cDNA prediction and cross species mapping in Xenopus laevis
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1261260/
https://www.ncbi.nlm.nih.gov/pubmed/16162280
http://dx.doi.org/10.1186/1471-2164-6-123