Metadata retrieval from sequence databases with ffq
MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, call...
Main authors: | Gálvez-Merchán, Ángel; Min, Kyung Hoi (Joseph); Pachter, Lior; Booeshaghi, A Sina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Applications Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883619/ https://www.ncbi.nlm.nih.gov/pubmed/36610997 http://dx.doi.org/10.1093/bioinformatics/btac667 |
_version_ | 1784879547634679808 |
---|---|
author | Gálvez-Merchán, Ángel Min, Kyung Hoi (Joseph) Pachter, Lior Booeshaghi, A Sina |
author_facet | Gálvez-Merchán, Ángel Min, Kyung Hoi (Joseph) Pachter, Lior Booeshaghi, A Sina |
author_sort | Gálvez-Merchán, Ángel |
collection | PubMed |
description | MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. AVAILABILITY AND IMPLEMENTATION: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq. |
format | Online Article Text |
id | pubmed-9883619 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9883619 2023-01-31 Metadata retrieval from sequence databases with ffq Gálvez-Merchán, Ángel Min, Kyung Hoi (Joseph) Pachter, Lior Booeshaghi, A Sina Bioinformatics Applications Note MOTIVATION: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. RESULTS: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. AVAILABILITY AND IMPLEMENTATION: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq. Oxford University Press 2023-01-05 /pmc/articles/PMC9883619/ /pubmed/36610997 http://dx.doi.org/10.1093/bioinformatics/btac667 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Gálvez-Merchán, Ángel Min, Kyung Hoi (Joseph) Pachter, Lior Booeshaghi, A Sina Metadata retrieval from sequence databases with ffq |
title | Metadata retrieval from sequence databases with ffq |
title_full | Metadata retrieval from sequence databases with ffq |
title_fullStr | Metadata retrieval from sequence databases with ffq |
title_full_unstemmed | Metadata retrieval from sequence databases with ffq |
title_short | Metadata retrieval from sequence databases with ffq |
title_sort | metadata retrieval from sequence databases with ffq |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883619/ https://www.ncbi.nlm.nih.gov/pubmed/36610997 http://dx.doi.org/10.1093/bioinformatics/btac667 |
work_keys_str_mv | AT galvezmerchanangel metadataretrievalfromsequencedatabaseswithffq AT minkyunghoijoseph metadataretrievalfromsequencedatabaseswithffq AT pachterlior metadataretrievalfromsequencedatabaseswithffq AT booeshaghiasina metadataretrievalfromsequencedatabaseswithffq |