ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files

BACKGROUND: Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of prote...

Bibliographic Details
Main authors:	Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
Format:	Text
Language:	English
Published:	BioMed Central 2002
Subjects:	Methodology article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC139979/ https://www.ncbi.nlm.nih.gov/pubmed/12493080 http://dx.doi.org/10.1186/1471-2105-3-40

author	Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
collection	PubMed
description	BACKGROUND: Functional genomics involves parallel experimentation with large sets of proteins. This requires managing large sets of open reading frames as a prerequisite for cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and the completeness of the sequence. The program has a graphical user interface but can also be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks that it is translated correctly. ORFer accepts user input as a single GenBank GI identifier or accession number, or as a list of them. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. CONCLUSION: The ORFer program allows fast retrieval of DNA sequences, protein sequences and their open reading frames, and sequence annotations from GenBank. It also supports storage of sequences and features in a relational database. Such a database can supplement a laboratory information management system (LIMS) with appropriate sequence information.
format	Text
id	pubmed-139979
institution	National Center for Biotechnology Information
language	English
publishDate	2002
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-139979 2003-01-16 ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files Büssow, Konrad Hoffmann, Steve Sievert, Volker BMC Bioinformatics Methodology article BioMed Central 2002-12-19 /pmc/articles/PMC139979/ /pubmed/12493080 http://dx.doi.org/10.1186/1471-2105-3-40 Text en Copyright ©2002 Büssow et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title	ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files
topic	Methodology article
