Scientific evidence based rare disease research discovery with research funding data in knowledge graph

BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim...

Bibliographic Details
Main authors: Zhu, Qian, Nguyễn, Ðắc-Trung, Sheils, Timothy, Alyea, Gioconda, Sid, Eric, Xu, Yanji, Dickens, James, Mathé, Ewy A., Pariser, Anne
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600882/
https://www.ncbi.nlm.nih.gov/pubmed/34794473
http://dx.doi.org/10.1186/s13023-021-02120-9
_version_ 1784601237843345408
author Zhu, Qian
Nguyễn, Ðắc-Trung
Sheils, Timothy
Alyea, Gioconda
Sid, Eric
Xu, Yanji
Dickens, James
Mathé, Ewy A.
Pariser, Anne
author_sort Zhu, Qian
collection PubMed
description BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge/research gaps, and funding patterns as scientific evidence is crucial to systematically accelerate the pace of research discovery in rare diseases, which is an overarching goal of this study. METHODS: To semantically represent NIH funding data for rare diseases and advance its use of effectively promoting rare disease research, we identified NIH funded projects for rare diseases by mapping GARD diseases to the project based on project titles; subsequently we presented and managed those identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a pre-defined data model that captures semantics among the data. With this developed knowledge graph, we were able to perform several case studies to demonstrate scientific evidence generation for supporting rare disease research discovery. RESULTS: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that are mapped to 45,647 distinct, NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. To capture semantic relationships presenting amongst mapped research funding data, we defined a data model comprised of seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research. 
CONCLUSION: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from where we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding related data, e.g., project abstracts and PubMed article abstracts, and linking to other types of biomedical data to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-021-02120-9.
format Online
Article
Text
id pubmed-8600882
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-86008822021-11-19 Scientific evidence based rare disease research discovery with research funding data in knowledge graph Zhu, Qian Nguyễn, Ðắc-Trung Sheils, Timothy Alyea, Gioconda Sid, Eric Xu, Yanji Dickens, James Mathé, Ewy A. Pariser, Anne Orphanet J Rare Dis Research BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge/research gaps, and funding patterns as scientific evidence is crucial to systematically accelerate the pace of research discovery in rare diseases, which is an overarching goal of this study. METHODS: To semantically represent NIH funding data for rare diseases and advance its use of effectively promoting rare disease research, we identified NIH funded projects for rare diseases by mapping GARD diseases to the project based on project titles; subsequently we presented and managed those identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a pre-defined data model that captures semantics among the data. With this developed knowledge graph, we were able to perform several case studies to demonstrate scientific evidence generation for supporting rare disease research discovery. RESULTS: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that are mapped to 45,647 distinct, NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. 
To capture semantic relationships presenting amongst mapped research funding data, we defined a data model comprised of seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research. CONCLUSION: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from where we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding related data, e.g., project abstracts and PubMed article abstracts, and linking to other types of biomedical data to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-021-02120-9. BioMed Central 2021-11-18 /pmc/articles/PMC8600882/ /pubmed/34794473 http://dx.doi.org/10.1186/s13023-021-02120-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. 
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Zhu, Qian
Nguyễn, Ðắc-Trung
Sheils, Timothy
Alyea, Gioconda
Sid, Eric
Xu, Yanji
Dickens, James
Mathé, Ewy A.
Pariser, Anne
Scientific evidence based rare disease research discovery with research funding data in knowledge graph
title Scientific evidence based rare disease research discovery with research funding data in knowledge graph
title_sort scientific evidence based rare disease research discovery with research funding data in knowledge graph
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600882/
https://www.ncbi.nlm.nih.gov/pubmed/34794473
http://dx.doi.org/10.1186/s13023-021-02120-9