BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology
Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-ann...
Main authors: | Lakiotaki, Kleanthi; Vorniotakis, Nikolaos; Tsagris, Michail; Georgakopoulos, Georgios; Tsamardinos, Ioannis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2018 |
Subjects: | Database Tool |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5836265/ https://www.ncbi.nlm.nih.gov/pubmed/29688366 http://dx.doi.org/10.1093/database/bay011 |
_version_ | 1783303931490729984 |
---|---|
author | Lakiotaki, Kleanthi Vorniotakis, Nikolaos Tsagris, Michail Georgakopoulos, Georgios Tsamardinos, Ioannis |
author_facet | Lakiotaki, Kleanthi Vorniotakis, Nikolaos Tsagris, Michail Georgakopoulos, Georgios Tsamardinos, Ioannis |
author_sort | Lakiotaki, Kleanthi |
collection | PubMed |
description | Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data with the aim to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce ready for downstream analysis datasets and automatically annotated them with disease-ontology terms. We also designate datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets, ∼260 000 samples spanning ∼500 diseases and can be easily used in large-scale massive experiments and meta-analysis. All datasets are publicly available for querying and downloading via BioDataome web application. We demonstrate BioDataome’s utility by presenting exploratory data analysis examples. We have also developed BioDataome R package found in: https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/ |
format | Online Article Text |
id | pubmed-5836265 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58362652019-03-12 BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology Lakiotaki, Kleanthi Vorniotakis, Nikolaos Tsagris, Michail Georgakopoulos, Georgios Tsamardinos, Ioannis Database (Oxford) Database Tool Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data with the aim to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce ready for downstream analysis datasets and automatically annotated them with disease-ontology terms. We also designate datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets, ∼260 000 samples spanning ∼500 diseases and can be easily used in large-scale massive experiments and meta-analysis. All datasets are publicly available for querying and downloading via BioDataome web application. We demonstrate BioDataome’s utility by presenting exploratory data analysis examples. We have also developed BioDataome R package found in: https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/ Oxford University Press 2018-03-02 /pmc/articles/PMC5836265/ /pubmed/29688366 http://dx.doi.org/10.1093/database/bay011 Text en © The Author(s) 2018. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Database Tool Lakiotaki, Kleanthi Vorniotakis, Nikolaos Tsagris, Michail Georgakopoulos, Georgios Tsamardinos, Ioannis BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title | BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title_full | BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title_fullStr | BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title_full_unstemmed | BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title_short | BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
title_sort | biodataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology |
topic | Database Tool |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5836265/ https://www.ncbi.nlm.nih.gov/pubmed/29688366 http://dx.doi.org/10.1093/database/bay011 |