
Metadata retrieval from sequence databases with ffq

MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, call...


Bibliographic Details
Main Authors: Gálvez-Merchán, Ángel, Min, Kyung Hoi (Joseph), Pachter, Lior, Booeshaghi, A Sina
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883619/
https://www.ncbi.nlm.nih.gov/pubmed/36610997
http://dx.doi.org/10.1093/bioinformatics/btac667
_version_ 1784879547634679808
author Gálvez-Merchán, Ángel
Min, Kyung Hoi (Joseph)
Pachter, Lior
Booeshaghi, A Sina
author_facet Gálvez-Merchán, Ángel
Min, Kyung Hoi (Joseph)
Pachter, Lior
Booeshaghi, A Sina
author_sort Gálvez-Merchán, Ángel
collection PubMed
description MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. AVAILABILITY AND IMPLEMENTATION: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq.
format Online
Article
Text
id pubmed-9883619
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9883619 2023-01-31 Metadata retrieval from sequence databases with ffq Gálvez-Merchán, Ángel Min, Kyung Hoi (Joseph) Pachter, Lior Booeshaghi, A Sina Bioinformatics Applications Note MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. AVAILABILITY AND IMPLEMENTATION: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq. Oxford University Press 2023-01-05 /pmc/articles/PMC9883619/ /pubmed/36610997 http://dx.doi.org/10.1093/bioinformatics/btac667 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Applications Note
Gálvez-Merchán, Ángel
Min, Kyung Hoi (Joseph)
Pachter, Lior
Booeshaghi, A Sina
Metadata retrieval from sequence databases with ffq
title Metadata retrieval from sequence databases with ffq
title_full Metadata retrieval from sequence databases with ffq
title_fullStr Metadata retrieval from sequence databases with ffq
title_full_unstemmed Metadata retrieval from sequence databases with ffq
title_short Metadata retrieval from sequence databases with ffq
title_sort metadata retrieval from sequence databases with ffq
topic Applications Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883619/
https://www.ncbi.nlm.nih.gov/pubmed/36610997
http://dx.doi.org/10.1093/bioinformatics/btac667
work_keys_str_mv AT galvezmerchanangel metadataretrievalfromsequencedatabaseswithffq
AT minkyunghoijoseph metadataretrievalfromsequencedatabaseswithffq
AT pachterlior metadataretrievalfromsequencedatabaseswithffq
AT booeshaghiasina metadataretrievalfromsequencedatabaseswithffq