FUSTA: leveraging FUSE for manipulation of multiFASTA files at scale


Bibliographic Details
Main authors:	Delehelle, Franklin; Roest Crollius, Hugues
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2022
Subjects:	Application Note
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9875552/ https://www.ncbi.nlm.nih.gov/pubmed/36713287 http://dx.doi.org/10.1093/bioadv/vbac091

Description
Summary:	MOTIVATION: FASTA files are the de facto standard for sharing, manipulating and storing biological sequences, while concatenated in multiFASTA they tend to be unwieldy for two main reasons: (i) they can become big enough that their manipulation with standard text-editing tools is unpractical, either due to slowness or memory consumption; (ii) by mixing metadata (headers) and data (sequences), bulk operations using standard text streaming tools (such as sed or awk) are impossible without including a parsing step, which may be error-prone and introduce friction in the development process. RESULTS: Here, we present FUSTA (FUse for faSTA), a software utility which makes use of the FUSE technology to expose a multiFASTA file as a hierarchy of virtual files, letting users operate directly on the sequences as independent virtual files through classical file manipulation methods. AVAILABILITY AND IMPLEMENTATION: FUSTA is freely available under the CeCILL-C (LGPLv3-compatible) license at https://github.com/delehef/fusta. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
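
The parsing step mentioned under point (ii) of the summary can be illustrated with a minimal sketch. It is not FUSTA's implementation or interface: the function names `read_fasta` and `uppercase_sequences` are hypothetical, and the sketch only shows why a bulk edit on sequences has to separate headers from sequence data first when working on a plain multiFASTA file.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a multiFASTA text stream.

    Hypothetical helper for illustration; not part of FUSTA.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def uppercase_sequences(handle: TextIO) -> str:
    """Uppercase every sequence while leaving the headers untouched."""
    out = []
    for header, seq in read_fasta(handle):
        out.append(f">{header}\n{seq.upper()}\n")
    return "".join(out)


if __name__ == "__main__":
    example = io.StringIO(">seq1 some description\nacgt\nttaa\n>seq2\nggcc\n")
    print(uppercase_sequences(example), end="")
```

Applying the same uppercase operation directly to the raw text, without the parsing step, would also alter the header lines, which is the kind of error the summary refers to.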
