
Exploring Integrative Analysis Using the BioMedical Evidence Graph


Bibliographic Details
Main authors: Struck, Adam, Walsh, Brian, Buchanan, Alexander, Lee, Jordan A., Spangler, Ryan, Stuart, Joshua M., Ellrott, Kyle
Format: Online Article Text
Language: English
Published: American Society of Clinical Oncology 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7049249/
https://www.ncbi.nlm.nih.gov/pubmed/32097025
http://dx.doi.org/10.1200/CCI.19.00110
author Struck, Adam
Walsh, Brian
Buchanan, Alexander
Lee, Jordan A.
Spangler, Ryan
Stuart, Joshua M.
Ellrott, Kyle
collection PubMed
description PURPOSE: The analysis of cancer biology data involves extremely heterogeneous data sets, including information from RNA sequencing, genome-wide copy number, DNA methylation data reporting on epigenetic regulation, somatic mutations from whole-exome or whole-genome analyses, pathology estimates from imaging sections or subtyping, drug response or other treatment outcomes, and various other clinical and phenotypic measurements. Bringing these different resources into a common framework, with a data model that allows for complex relationships as well as dense vectors of features, will unlock integrated data set analysis. METHODS: We introduce the BioMedical Evidence Graph (BMEG), a graph database and query engine for discovery and analysis of cancer biology. The BMEG is unique among biologic data graphs in that sample-level molecular and clinical information is connected to reference knowledge bases. It combines gene expression and mutation data with drug-response experiments, pathway information databases, and literature-derived associations. RESULTS: The construction of the BMEG has resulted in a graph containing > 41 million vertices and 57 million edges. The BMEG system provides a graph query–based application programming interface to enable analysis, with client code available for Python, JavaScript, and R, and a server online at bmeg.io. Using this system, we have demonstrated several forms of cross–data set analysis to show the utility of the system. CONCLUSION: The BMEG is an evolving resource dedicated to enabling integrative analysis. We have demonstrated queries on the system that illustrate mutation significance analysis, drug-response machine learning, patient-level knowledge-base queries, and pathway-level analysis. We have compared the resulting graph to other available integrated graph systems and demonstrated that the BMEG is unique in the scale of the graph and the type of data it makes available.
format Online
Article
Text
id pubmed-7049249
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Society of Clinical Oncology
record_format MEDLINE/PubMed
spelling pubmed-7049249 2021-02-25 Exploring Integrative Analysis Using the BioMedical Evidence Graph Struck, Adam Walsh, Brian Buchanan, Alexander Lee, Jordan A. Spangler, Ryan Stuart, Joshua M. Ellrott, Kyle JCO Clin Cancer Inform Original Reports American Society of Clinical Oncology 2020-02-25 /pmc/articles/PMC7049249/ /pubmed/32097025 http://dx.doi.org/10.1200/CCI.19.00110 Text en © 2020 by American Society of Clinical Oncology. Licensed under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/
title Exploring Integrative Analysis Using the BioMedical Evidence Graph
topic Original Reports