EpiGraphDB: a database and data mining platform for health data science

MOTIVATION: The wealth of data resources on human phenotypes, risk factors, molecular traits and therapeutic interventions presents new opportunities for population health sciences. These opportunities are paralleled by a growing need for data integration, curation and mining to increase research ef...

Bibliographic Details
Main authors: Liu, Yi; Elsworth, Benjamin; Erola, Pau; Haberland, Valeriia; Hemani, Gibran; Lyon, Matt; Zheng, Jie; Lloyd, Oliver; Vabistsevits, Marina; Gaunt, Tom R
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8189674/
https://www.ncbi.nlm.nih.gov/pubmed/33165574
http://dx.doi.org/10.1093/bioinformatics/btaa961
author Liu, Yi
Elsworth, Benjamin
Erola, Pau
Haberland, Valeriia
Hemani, Gibran
Lyon, Matt
Zheng, Jie
Lloyd, Oliver
Vabistsevits, Marina
Gaunt, Tom R
collection PubMed
description MOTIVATION: The wealth of data resources on human phenotypes, risk factors, molecular traits and therapeutic interventions presents new opportunities for population health sciences. These opportunities are paralleled by a growing need for data integration, curation and mining to increase research efficiency, reduce mis-inference and ensure reproducible research. RESULTS: We developed EpiGraphDB (https://epigraphdb.org/), a graph database containing an array of different biomedical and epidemiological relationships and an analytical platform to support their use in human population health data science. In addition, we present three case studies that illustrate the value of this platform. The first uses EpiGraphDB to evaluate potential pleiotropic relationships, addressing mis-inference in systematic causal analysis. In the second case study, we illustrate how protein–protein interaction data offer opportunities to identify new drug targets. The final case study integrates causal inference using Mendelian randomization with relationships mined from the biomedical literature to ‘triangulate’ evidence from different sources. AVAILABILITY AND IMPLEMENTATION: The EpiGraphDB platform is openly available at https://epigraphdb.org. Code for replicating case study results is available at https://github.com/MRCIEU/epigraphdb as Jupyter notebooks using the API, and https://mrcieu.github.io/epigraphdb-r using the R package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8189674
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8189674 2021-06-10 EpiGraphDB: a database and data mining platform for health data science Liu, Yi Elsworth, Benjamin Erola, Pau Haberland, Valeriia Hemani, Gibran Lyon, Matt Zheng, Jie Lloyd, Oliver Vabistsevits, Marina Gaunt, Tom R Bioinformatics Original Papers Oxford University Press 2020-11-24 /pmc/articles/PMC8189674/ /pubmed/33165574 http://dx.doi.org/10.1093/bioinformatics/btaa961 Text en © The Author(s) 2020. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title EpiGraphDB: a database and data mining platform for health data science
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8189674/
https://www.ncbi.nlm.nih.gov/pubmed/33165574
http://dx.doi.org/10.1093/bioinformatics/btaa961