gff2sequence, a new user friendly tool for the generation of genomic sequences

BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns, primary transcripts, etc. Although many software packages (e.g. ab initio gene prediction programs) can annotate features by using such a standard, a small number of tools have been develop...

Bibliographic Details
Main authors: Camiolo, Salvatore; Porceddu, Andrea
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848729/
https://www.ncbi.nlm.nih.gov/pubmed/24020993
http://dx.doi.org/10.1186/1756-0381-6-15
_version_ 1782293808523771904
author Camiolo, Salvatore
Porceddu, Andrea
author_facet Camiolo, Salvatore
Porceddu, Andrea
author_sort Camiolo, Salvatore
collection PubMed
description BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns, primary transcripts, etc. Although many software packages (e.g. ab initio gene prediction programs) can annotate features by using such a standard, only a small number of tools have been developed to extract the corresponding sequence information from the original genome. Moreover, the present tools neither perform quality control nor offer a customizable filter for the annotated features. FINDINGS: gff2sequence is a program that extracts nucleotide/protein sequences from a genomic multifasta by using the information provided by a general feature format file. While a graphical user interface makes this software very easy to use, a C++ algorithm allows high performance together with low hardware demand. The software also allows the extraction of genic portions such as the untranslated and the coding sequences. In addition, a highly customizable quality control pipeline can be used to deal with anomalous splicing sites, incorrect open reading frames and non-canonical characters within the retrieved sequences. CONCLUSIONS: gff2sequence is a user friendly program that allows the generation of highly customizable sequence datasets by processing a general feature format file. The presence of a wide range of quality filters also makes this tool suitable for refining ab initio gene predictions.
format Online
Article
Text
id pubmed-3848729
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-38487292013-12-04 gff2sequence, a new user friendly tool for the generation of genomic sequences Camiolo, Salvatore Porceddu, Andrea BioData Min Software Article BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns, primary transcripts, etc. Although many software packages (e.g. ab initio gene prediction programs) can annotate features by using such a standard, only a small number of tools have been developed to extract the corresponding sequence information from the original genome. Moreover, the present tools neither perform quality control nor offer a customizable filter for the annotated features. FINDINGS: gff2sequence is a program that extracts nucleotide/protein sequences from a genomic multifasta by using the information provided by a general feature format file. While a graphical user interface makes this software very easy to use, a C++ algorithm allows high performance together with low hardware demand. The software also allows the extraction of genic portions such as the untranslated and the coding sequences. In addition, a highly customizable quality control pipeline can be used to deal with anomalous splicing sites, incorrect open reading frames and non-canonical characters within the retrieved sequences. CONCLUSIONS: gff2sequence is a user friendly program that allows the generation of highly customizable sequence datasets by processing a general feature format file. The presence of a wide range of quality filters also makes this tool suitable for refining ab initio gene predictions. BioMed Central 2013-09-11 /pmc/articles/PMC3848729/ /pubmed/24020993 http://dx.doi.org/10.1186/1756-0381-6-15 Text en Copyright © 2013 Camiolo and Porceddu; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Article
Camiolo, Salvatore
Porceddu, Andrea
gff2sequence, a new user friendly tool for the generation of genomic sequences
title gff2sequence, a new user friendly tool for the generation of genomic sequences
title_full gff2sequence, a new user friendly tool for the generation of genomic sequences
title_fullStr gff2sequence, a new user friendly tool for the generation of genomic sequences
title_full_unstemmed gff2sequence, a new user friendly tool for the generation of genomic sequences
title_short gff2sequence, a new user friendly tool for the generation of genomic sequences
title_sort gff2sequence, a new user friendly tool for the generation of genomic sequences
topic Software Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848729/
https://www.ncbi.nlm.nih.gov/pubmed/24020993
http://dx.doi.org/10.1186/1756-0381-6-15
work_keys_str_mv AT camiolosalvatore gff2sequenceanewuserfriendlytoolforthegenerationofgenomicsequences
AT porcedduandrea gff2sequenceanewuserfriendlytoolforthegenerationofgenomicsequences
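
As an illustration of the workflow summarized in the description field (extracting feature sequences from a genomic multi-FASTA by using GFF coordinates, then filtering out sequences with non-canonical characters or implausible open reading frames), the following is a minimal Python sketch. It is not gff2sequence's own code, which the record describes as a C++ program with a graphical interface. All function names, the toy genome and the specific ORF criteria (ATG start, terminal stop codon, no internal stop) are assumptions made for this example.

```python
# Illustrative sketch only: names and ORF criteria are assumptions,
# not the gff2sequence implementation.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Return {sequence_id: sequence} from multi-FASTA text."""
    chunks, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]  # ID is the first word of the header
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def parse_gff(text):
    """Yield (seqid, type, start, end, strand) for each 9-column GFF line."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # skip malformed lines
        yield cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def extract(genome, feature):
    """Slice a feature out of the genome; GFF coordinates are 1-based, inclusive."""
    seqid, _type, start, end, strand = feature
    seq = genome[seqid][start - 1:end]
    return reverse_complement(seq) if strand == "-" else seq


def has_only_canonical_bases(seq):
    return set(seq.upper()) <= set("ACGT")


def is_plausible_orf(cds):
    """Assumed criteria: length divisible by 3, ATG start, one terminal stop."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    internal_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return codons[0] == "ATG" and codons[-1] in STOP_CODONS and not internal_stop


# Toy input (hypothetical data): one contig and three CDS features.
FASTA_TEXT = ">chr1 toy contig\nATGGCCTAAGG\nTTAGGCCATNN\n"
GFF_TEXT = "\n".join([
    "##gff-version 3",
    "\t".join(["chr1", "demo", "CDS", "1", "9", ".", "+", "0", "ID=cds1"]),
    "\t".join(["chr1", "demo", "CDS", "12", "20", ".", "-", "0", "ID=cds2"]),
    "\t".join(["chr1", "demo", "CDS", "19", "22", ".", "+", "0", "ID=cds3"]),
])

genome = parse_fasta(FASTA_TEXT)
results = []
for feature in parse_gff(GFF_TEXT):
    seq = extract(genome, feature)
    passed = has_only_canonical_bases(seq) and is_plausible_orf(seq)
    results.append((feature[2], feature[3], seq, passed))
```

In this toy run the first two features (one on each strand) both yield ATGGCCTAA and pass the checks, while the third contains N characters and is rejected. A real pipeline would also need to join multi-exon features and inspect splice-site dinucleotides, which this sketch leaves out.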