A decoupled, modular and scriptable architecture for tools to curate data platforms
MOTIVATION: Curation is essential for any data platform to maintain the quality of the data it provides. Today, more effective curation tools are often vital to keep up with the rapid growth of existing, maintenance-requiring databases and the amount of newly published information that needs to be s...
Main Authors: | Langenstein, Momo; Hermjakob, Henning; Bernal Llinares, Manuel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Applications Notes |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545344/ https://www.ncbi.nlm.nih.gov/pubmed/33830216 http://dx.doi.org/10.1093/bioinformatics/btab233 |
_version_ | 1784589996808732672 |
---|---|
author | Langenstein, Momo Hermjakob, Henning Bernal Llinares, Manuel |
author_facet | Langenstein, Momo Hermjakob, Henning Bernal Llinares, Manuel |
author_sort | Langenstein, Momo |
collection | PubMed |
description | MOTIVATION: Curation is essential for any data platform to maintain the quality of the data it provides. Today, more effective curation tools are often vital to keep up with the rapid growth of existing, maintenance-requiring databases and the amount of newly published information that needs to be surveyed. However, curation interfaces are often complex and challenging to be further developed. Therefore, opportunities for experimentation with curation workflows may be lost due to a lack of development resources or a reluctance to change sensitive production systems. RESULTS: We propose a decoupled, modular and scriptable architecture to build new curation tools on top of existing platforms. Our architecture treats the existing platform as a black box. It, therefore, only relies on its public application programming interfaces and web application instead of requiring any changes to the existing infrastructure. As a case study, we have implemented this architecture in cmd-iaso, a curation tool for the identifiers.org registry. With cmd-iaso, we also show that the proposed design’s flexibility can be utilized to streamline and enhance the curator’s workflow with the platform’s existing web interface. AVAILABILITY AND IMPLEMENTATION: The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker container at https://hub.docker.com/r/identifiersorg/cmd-iaso. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8545344 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8545344 2021-10-26 A decoupled, modular and scriptable architecture for tools to curate data platforms Langenstein, Momo Hermjakob, Henning Bernal Llinares, Manuel Bioinformatics Applications Notes MOTIVATION: Curation is essential for any data platform to maintain the quality of the data it provides. Today, more effective curation tools are often vital to keep up with the rapid growth of existing, maintenance-requiring databases and the amount of newly published information that needs to be surveyed. However, curation interfaces are often complex and challenging to be further developed. Therefore, opportunities for experimentation with curation workflows may be lost due to a lack of development resources or a reluctance to change sensitive production systems. RESULTS: We propose a decoupled, modular and scriptable architecture to build new curation tools on top of existing platforms. Our architecture treats the existing platform as a black box. It, therefore, only relies on its public application programming interfaces and web application instead of requiring any changes to the existing infrastructure. As a case study, we have implemented this architecture in cmd-iaso, a curation tool for the identifiers.org registry. With cmd-iaso, we also show that the proposed design’s flexibility can be utilized to streamline and enhance the curator’s workflow with the platform’s existing web interface. AVAILABILITY AND IMPLEMENTATION: The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker container at https://hub.docker.com/r/identifiersorg/cmd-iaso. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Oxford University Press 2021-04-08 /pmc/articles/PMC8545344/ /pubmed/33830216 http://dx.doi.org/10.1093/bioinformatics/btab233 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes Langenstein, Momo Hermjakob, Henning Bernal Llinares, Manuel A decoupled, modular and scriptable architecture for tools to curate data platforms |
title | A decoupled, modular and scriptable architecture for tools to curate data platforms |
title_full | A decoupled, modular and scriptable architecture for tools to curate data platforms |
title_fullStr | A decoupled, modular and scriptable architecture for tools to curate data platforms |
title_full_unstemmed | A decoupled, modular and scriptable architecture for tools to curate data platforms |
title_short | A decoupled, modular and scriptable architecture for tools to curate data platforms |
title_sort | decoupled, modular and scriptable architecture for tools to curate data platforms |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545344/ https://www.ncbi.nlm.nih.gov/pubmed/33830216 http://dx.doi.org/10.1093/bioinformatics/btab233 |
work_keys_str_mv | AT langensteinmomo adecoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms AT hermjakobhenning adecoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms AT bernalllinaresmanuel adecoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms AT langensteinmomo decoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms AT hermjakobhenning decoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms AT bernalllinaresmanuel decoupledmodularandscriptablearchitecturefortoolstocuratedataplatforms |