gff2sequence, a new user friendly tool for the generation of genomic sequences
BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns and primary transcripts. Although many software packages (e.g. ab initio gene prediction programs) can annotate features using this standard, only a small number of tools have been develop...
Main authors: | Camiolo, Salvatore; Porceddu, Andrea |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Software Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848729/ https://www.ncbi.nlm.nih.gov/pubmed/24020993 http://dx.doi.org/10.1186/1756-0381-6-15 |
_version_ | 1782293808523771904 |
---|---|
author | Camiolo, Salvatore Porceddu, Andrea |
author_facet | Camiolo, Salvatore Porceddu, Andrea |
author_sort | Camiolo, Salvatore |
collection | PubMed |
description | BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns and primary transcripts. Although many software packages (e.g. ab initio gene prediction programs) can annotate features using this standard, only a small number of tools have been developed to extract the corresponding sequence information from the original genome. However, the existing tools neither perform quality control nor offer a customizable filter for the annotated features. FINDINGS: gff2sequence is a program that extracts nucleotide/protein sequences from a genomic multi-FASTA file by using the information provided by a General Feature Format file. A graphical user interface makes the software easy to use, while a C++ implementation provides high performance with low hardware demand. The software also allows the extraction of genic portions such as the untranslated and coding sequences. Moreover, a highly customizable quality control pipeline can be used to deal with anomalous splice sites, incorrect open reading frames and non-canonical characters within the retrieved sequences. CONCLUSIONS: gff2sequence is a user-friendly program that generates highly customizable sequence datasets by processing a General Feature Format file. The wide range of quality filters also makes this tool suitable for refining ab initio gene predictions. |
format | Online Article Text |
id | pubmed-3848729 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3848729 2013-12-04 gff2sequence, a new user friendly tool for the generation of genomic sequences Camiolo, Salvatore Porceddu, Andrea BioData Min Software Article BACKGROUND: General Feature Format (GFF) files are used to store genome features such as genes, exons, introns and primary transcripts. Although many software packages (e.g. ab initio gene prediction programs) can annotate features using this standard, only a small number of tools have been developed to extract the corresponding sequence information from the original genome. However, the existing tools neither perform quality control nor offer a customizable filter for the annotated features. FINDINGS: gff2sequence is a program that extracts nucleotide/protein sequences from a genomic multi-FASTA file by using the information provided by a General Feature Format file. A graphical user interface makes the software easy to use, while a C++ implementation provides high performance with low hardware demand. The software also allows the extraction of genic portions such as the untranslated and coding sequences. Moreover, a highly customizable quality control pipeline can be used to deal with anomalous splice sites, incorrect open reading frames and non-canonical characters within the retrieved sequences. CONCLUSIONS: gff2sequence is a user-friendly program that generates highly customizable sequence datasets by processing a General Feature Format file. The wide range of quality filters also makes this tool suitable for refining ab initio gene predictions. BioMed Central 2013-09-11 /pmc/articles/PMC3848729/ /pubmed/24020993 http://dx.doi.org/10.1186/1756-0381-6-15 Text en Copyright © 2013 Camiolo and Porceddu; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Article Camiolo, Salvatore Porceddu, Andrea gff2sequence, a new user friendly tool for the generation of genomic sequences |
title | gff2sequence, a new user friendly tool for the generation of genomic sequences |
topic | Software Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848729/ https://www.ncbi.nlm.nih.gov/pubmed/24020993 http://dx.doi.org/10.1186/1756-0381-6-15 |
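
The description above says that gff2sequence extracts sequences from a genomic multi-FASTA file using GFF coordinates and can filter out sequences with non-canonical characters or incorrect open reading frames. The following is a minimal Python sketch of that general idea only. It is not the tool's actual C++ implementation, and every name in it (`Feature`, `parse_fasta`, `parse_gff`, `extract_sequence`, `has_only_canonical_bases`, `looks_like_orf`) is hypothetical. It assumes 1-based inclusive GFF coordinates and tab-separated nine-column GFF lines.

```python
from dataclasses import dataclass

# Complement table for reverse-complementing minus-strand features.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Feature:
    seqid: str   # name of the FASTA record the feature lies on
    ftype: str   # feature type, e.g. "CDS" or "exon"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def parse_fasta(text):
    """Return {record name: sequence} from multi-FASTA text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {k: "".join(v) for k, v in records.items()}


def parse_gff(text):
    """Return a list of Feature objects, skipping comments and short lines."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        features.append(
            Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6])
        )
    return features


def extract_sequence(genome, feature):
    """Slice the feature out of the genome; reverse-complement if on '-'."""
    seq = genome[feature.seqid][feature.start - 1:feature.end]
    if feature.strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


def has_only_canonical_bases(seq):
    """True if the sequence is non-empty and contains only A, C, G, T."""
    return bool(seq) and set(seq.upper()) <= set("ACGT")


def looks_like_orf(cds):
    """Simplified ORF check: length divisible by 3, ATG start, stop at end.

    Does not look for in-frame internal stop codons.
    """
    cds = cds.upper()
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


if __name__ == "__main__":
    genome = parse_fasta(">chr1 demo\nAAATGGCCTAAGG\nNNCTTAGGCCATCC\n")
    gff = "chr1\tdemo\tCDS\t3\t11\t.\t+\t0\tID=a\n"
    for feat in parse_gff(gff):
        seq = extract_sequence(genome, feat)
        print(feat.ftype, seq, has_only_canonical_bases(seq), looks_like_orf(seq))
```

A real pipeline would also join the exons of each transcript, validate splice sites, and translate coding sequences to protein; those steps are left out here to keep the sketch short.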