FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow
Functional annotation allows adding biologically relevant information to predicted features in genomic sequences, and it is, therefore, an important procedure of any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-...
Main authors: | Vlasova, Anna; Hermoso Pulido, Toni; Camara, Francisco; Ponomarenko, Julia; Guigó, Roderic |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8535801/ https://www.ncbi.nlm.nih.gov/pubmed/34681040 http://dx.doi.org/10.3390/genes12101645 |
_version_ | 1784587871434309632 |
---|---|
author | Vlasova, Anna Hermoso Pulido, Toni Camara, Francisco Ponomarenko, Julia Guigó, Roderic |
author_facet | Vlasova, Anna Hermoso Pulido, Toni Camara, Francisco Ponomarenko, Julia Guigó, Roderic |
author_sort | Vlasova, Anna |
collection | PubMed |
description | Functional annotation allows adding biologically relevant information to predicted features in genomic sequences, and it is, therefore, an important procedure of any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the abovementioned programs and final annotation reports. The pipeline can be broken easily into smaller processes for the purpose of parallelization and easily deployed in a Linux computational environment, thanks to software containerization, thus helping to ensure full reproducibility. |
format | Online Article Text |
id | pubmed-8535801 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8535801 2021-10-23 FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow Vlasova, Anna Hermoso Pulido, Toni Camara, Francisco Ponomarenko, Julia Guigó, Roderic Genes (Basel) Article Functional annotation allows adding biologically relevant information to predicted features in genomic sequences, and it is, therefore, an important procedure of any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the abovementioned programs and final annotation reports. The pipeline can be broken easily into smaller processes for the purpose of parallelization and easily deployed in a Linux computational environment, thanks to software containerization, thus helping to ensure full reproducibility. MDPI 2021-10-19 /pmc/articles/PMC8535801/ /pubmed/34681040 http://dx.doi.org/10.3390/genes12101645 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Vlasova, Anna Hermoso Pulido, Toni Camara, Francisco Ponomarenko, Julia Guigó, Roderic FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow |
title | FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8535801/ https://www.ncbi.nlm.nih.gov/pubmed/34681040 http://dx.doi.org/10.3390/genes12101645 |