ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files
BACKGROUND: Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of prote...
Main Authors: | Büssow, Konrad; Hoffmann, Steve; Sievert, Volker |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2002 |
Subjects: | Methodology article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC139979/ https://www.ncbi.nlm.nih.gov/pubmed/12493080 http://dx.doi.org/10.1186/1471-2105-3-40 |
_version_ | 1782120581707071488 |
---|---|
author | Büssow, Konrad; Hoffmann, Steve; Sievert, Volker |
author_facet | Büssow, Konrad; Hoffmann, Steve; Sievert, Volker |
author_sort | Büssow, Konrad |
collection | PubMed |
description | BACKGROUND: Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. CONCLUSION: The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information. |
format | Text |
id | pubmed-139979 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2002 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-139979 2003-01-16 ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files Büssow, Konrad Hoffmann, Steve Sievert, Volker BMC Bioinformatics Methodology article BACKGROUND: Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. RESULTS: A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. CONCLUSION: The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information. BioMed Central 2002-12-19 /pmc/articles/PMC139979/ /pubmed/12493080 http://dx.doi.org/10.1186/1471-2105-3-40 Text en Copyright ©2002 Büssow et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. |
spellingShingle | Methodology article Büssow, Konrad Hoffmann, Steve Sievert, Volker ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title | ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title_full | ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title_fullStr | ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title_full_unstemmed | ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title_short | ORFer – retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files |
title_sort | orfer – retrieval of protein sequences and open reading frames from genbank and storage into relational databases or text files |
topic | Methodology article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC139979/ https://www.ncbi.nlm.nih.gov/pubmed/12493080 http://dx.doi.org/10.1186/1471-2105-3-40 |
work_keys_str_mv | AT bussowkonrad orferretrievalofproteinsequencesandopenreadingframesfromgenbankandstorageintorelationaldatabasesortextfiles AT hoffmannsteve orferretrievalofproteinsequencesandopenreadingframesfromgenbankandstorageintorelationaldatabasesortextfiles AT sievertvolker orferretrievalofproteinsequencesandopenreadingframesfromgenbankandstorageintorelationaldatabasesortextfiles |
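
The description above says that ORFer extracts the open reading frame of a protein entry, checks its correct translation, and can export sequences in Fasta format. The following Java sketch illustrates what such a check and export might look like. It is not ORFer's code: the class and method names are hypothetical, and it assumes the standard genetic code (NCBI translation table 1), which the record does not specify.

```java
// Hypothetical illustration only; not taken from the ORFer program.
class OrfCheckSketch {
    // Standard genetic code (assumed), with codon bases ordered T, C, A, G.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Translate a nucleotide ORF codon by codon, stopping at the first stop codon.
    // Codons containing characters other than T, C, A, G (or U) become 'X'.
    static String translate(String orf) {
        String dna = orf.toUpperCase().replace('U', 'T');
        StringBuilder protein = new StringBuilder();
        for (int i = 0; i + 3 <= dna.length(); i += 3) {
            int a = BASES.indexOf(dna.charAt(i));
            int b = BASES.indexOf(dna.charAt(i + 1));
            int c = BASES.indexOf(dna.charAt(i + 2));
            if (a < 0 || b < 0 || c < 0) {
                protein.append('X');
                continue;
            }
            char aminoAcid = AMINO_ACIDS.charAt(16 * a + 4 * b + c);
            if (aminoAcid == '*') {
                break; // stop codon ends the reading frame
            }
            protein.append(aminoAcid);
        }
        return protein.toString();
    }

    // True if the ORF translates exactly to the annotated protein sequence.
    static boolean translationMatches(String orf, String protein) {
        return translate(orf).equalsIgnoreCase(protein);
    }

    // Format one sequence as a Fasta record: a ">" header line, then the sequence.
    static String toFasta(String identifier, String sequence) {
        return ">" + identifier + "\n" + sequence + "\n";
    }

    public static void main(String[] args) {
        String orf = "ATGGCCAAATAA";   // made-up example sequence
        String protein = "MAK";        // made-up example protein
        System.out.println(translationMatches(orf, protein));
        System.out.print(toFasta("example_orf", orf));
    }
}
```

A mismatch between the translated ORF and the stored protein sequence would flag an entry for manual inspection before cloning; how ORFer itself reports such cases is not stated in this record.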