
BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology

Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-ann...


Bibliographic Details
Main authors: Lakiotaki, Kleanthi; Vorniotakis, Nikolaos; Tsagris, Michail; Georgakopoulos, Georgios; Tsamardinos, Ioannis
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects: Database Tool
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5836265/
https://www.ncbi.nlm.nih.gov/pubmed/29688366
http://dx.doi.org/10.1093/database/bay011
collection PubMed
description Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data with the aim to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce ready for downstream analysis datasets and automatically annotated them with disease-ontology terms. We also designate datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets, ∼260 000 samples spanning ∼500 diseases and can be easily used in large-scale massive experiments and meta-analysis. All datasets are publicly available for querying and downloading via BioDataome web application. We demonstrate BioDataome’s utility by presenting exploratory data analysis examples. We have also developed BioDataome R package found in: https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/
id pubmed-5836265
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
journal Database (Oxford)
publication_date 2018-03-02
rights © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Database Tool