Automated gene data integration with Databio
OBJECTIVE: Although sequencing and other high-throughput data production technologies are increasingly affordable, data analysis and interpretation remain a significant factor in the cost of -omics studies. Despite the broad acceptance of findable, accessible, interoperable, and reusable (FAIR) dat...
Main Authors: | Reid, Robert W.; Ferrier, Jacob W.; Jay, Jeremy J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7110638/ https://www.ncbi.nlm.nih.gov/pubmed/32238171 http://dx.doi.org/10.1186/s13104-020-05038-w |
_version_ | 1783513092310695936 |
---|---|
author | Reid, Robert W.; Ferrier, Jacob W.; Jay, Jeremy J. |
author_facet | Reid, Robert W.; Ferrier, Jacob W.; Jay, Jeremy J. |
author_sort | Reid, Robert W. |
collection | PubMed |
description | OBJECTIVE: Although sequencing and other high-throughput data production technologies are increasingly affordable, data analysis and interpretation remain a significant factor in the cost of -omics studies. Despite the broad acceptance of findable, accessible, interoperable, and reusable (FAIR) data principles, which focus on data discoverability and annotation, data integration remains a significant bottleneck in linking prior work in order to better understand novel research. Relevant and timely information discovery is difficult for increasingly multi-disciplinary projects when scientists cannot easily keep up with work across multiple fields. Computational tools are necessary to accurately describe data contents and to enable linkage to existing resources without prior knowledge of the various database resources. RESULTS: We developed the Databio tool, accessible at https://datab.io/, to automate data parsing and identifier detection and to streamline common tasks, providing a point-and-click approach to data manipulation and integration in life sciences research and translational medicine. Databio uses fast real-time data structures and a data warehouse of 137 million identifiers, with automated heuristics to describe data provenance without highly specialized knowledge or bioinformatics training. |
format | Online Article Text |
id | pubmed-7110638 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7110638 2020-04-07 Automated gene data integration with Databio Reid, Robert W. Ferrier, Jacob W. Jay, Jeremy J. BMC Res Notes Research Note OBJECTIVE: Although sequencing and other high-throughput data production technologies are increasingly affordable, data analysis and interpretation remain a significant factor in the cost of -omics studies. Despite the broad acceptance of findable, accessible, interoperable, and reusable (FAIR) data principles, which focus on data discoverability and annotation, data integration remains a significant bottleneck in linking prior work in order to better understand novel research. Relevant and timely information discovery is difficult for increasingly multi-disciplinary projects when scientists cannot easily keep up with work across multiple fields. Computational tools are necessary to accurately describe data contents and to enable linkage to existing resources without prior knowledge of the various database resources. RESULTS: We developed the Databio tool, accessible at https://datab.io/, to automate data parsing and identifier detection and to streamline common tasks, providing a point-and-click approach to data manipulation and integration in life sciences research and translational medicine. Databio uses fast real-time data structures and a data warehouse of 137 million identifiers, with automated heuristics to describe data provenance without highly specialized knowledge or bioinformatics training. BioMed Central 2020-04-01 /pmc/articles/PMC7110638/ /pubmed/32238171 http://dx.doi.org/10.1186/s13104-020-05038-w Text en © The Author(s) 2020. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Note Reid, Robert W. Ferrier, Jacob W. Jay, Jeremy J. Automated gene data integration with Databio |
title | Automated gene data integration with Databio |
title_sort | automated gene data integration with databio |
topic | Research Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7110638/ https://www.ncbi.nlm.nih.gov/pubmed/32238171 http://dx.doi.org/10.1186/s13104-020-05038-w |
work_keys_str_mv | AT reidrobertw automatedgenedataintegrationwithdatabio AT ferrierjacobw automatedgenedataintegrationwithdatabio AT jayjeremyj automatedgenedataintegrationwithdatabio |