Scientific evidence based rare disease research discovery with research funding data in knowledge graph
BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim...
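Main authors: | Zhu, Qian; Nguyễn, Ðắc-Trung; Sheils, Timothy; Alyea, Gioconda; Sid, Eric; Xu, Yanji; Dickens, James; Mathé, Ewy A.; Pariser, Anne |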
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600882/ https://www.ncbi.nlm.nih.gov/pubmed/34794473 http://dx.doi.org/10.1186/s13023-021-02120-9 |
_version_ | 1784601237843345408 |
---|---|
author | Zhu, Qian Nguyễn, Ðắc-Trung Sheils, Timothy Alyea, Gioconda Sid, Eric Xu, Yanji Dickens, James Mathé, Ewy A. Pariser, Anne |
author_facet | Zhu, Qian Nguyễn, Ðắc-Trung Sheils, Timothy Alyea, Gioconda Sid, Eric Xu, Yanji Dickens, James Mathé, Ewy A. Pariser, Anne |
author_sort | Zhu, Qian |
collection | PubMed |
description | BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge/research gaps, and funding patterns as scientific evidence is crucial to systematically accelerate the pace of research discovery in rare diseases, which is an overarching goal of this study. METHODS: To semantically represent NIH funding data for rare diseases and advance its use of effectively promoting rare disease research, we identified NIH funded projects for rare diseases by mapping GARD diseases to the project based on project titles; subsequently we presented and managed those identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a pre-defined data model that captures semantics among the data. With this developed knowledge graph, we were able to perform several case studies to demonstrate scientific evidence generation for supporting rare disease research discovery. RESULTS: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that are mapped to 45,647 distinct, NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. To capture semantic relationships presenting amongst mapped research funding data, we defined a data model comprised of seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research. 
CONCLUSION: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from where we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding related data, e.g., project abstracts and PubMed article abstracts, and linking to other types of biomedical data to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-021-02120-9. |
format | Online Article Text |
id | pubmed-8600882 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-86008822021-11-19 Scientific evidence based rare disease research discovery with research funding data in knowledge graph Zhu, Qian Nguyễn, Ðắc-Trung Sheils, Timothy Alyea, Gioconda Sid, Eric Xu, Yanji Dickens, James Mathé, Ewy A. Pariser, Anne Orphanet J Rare Dis Research BACKGROUND: Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to inspire and encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge/research gaps, and funding patterns as scientific evidence is crucial to systematically accelerate the pace of research discovery in rare diseases, which is an overarching goal of this study. METHODS: To semantically represent NIH funding data for rare diseases and advance its use of effectively promoting rare disease research, we identified NIH funded projects for rare diseases by mapping GARD diseases to the project based on project titles; subsequently we presented and managed those identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a pre-defined data model that captures semantics among the data. With this developed knowledge graph, we were able to perform several case studies to demonstrate scientific evidence generation for supporting rare disease research discovery. RESULTS: Of 5001 rare diseases belonging to 32 distinct disease categories, we identified 1294 diseases that are mapped to 45,647 distinct, NIH-funded projects obtained from the NIH ExPORTER by implementing semantic annotation of project titles. 
To capture semantic relationships presenting amongst mapped research funding data, we defined a data model comprised of seven primary classes and corresponding object and data properties. A Neo4j knowledge graph based on this predefined data model has been developed, and we performed multiple case studies over this knowledge graph to demonstrate its use in directing and promoting rare disease research. CONCLUSION: We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from where we can effectively identify and generate scientific evidence to support rare disease research. With the success of this preliminary study, we plan to implement advanced computational approaches for analyzing more funding related data, e.g., project abstracts and PubMed article abstracts, and linking to other types of biomedical data to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13023-021-02120-9. BioMed Central 2021-11-18 /pmc/articles/PMC8600882/ /pubmed/34794473 http://dx.doi.org/10.1186/s13023-021-02120-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. 
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Zhu, Qian Nguyễn, Ðắc-Trung Sheils, Timothy Alyea, Gioconda Sid, Eric Xu, Yanji Dickens, James Mathé, Ewy A. Pariser, Anne Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title | Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title_full | Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title_fullStr | Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title_full_unstemmed | Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title_short | Scientific evidence based rare disease research discovery with research funding data in knowledge graph |
title_sort | scientific evidence based rare disease research discovery with research funding data in knowledge graph |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8600882/ https://www.ncbi.nlm.nih.gov/pubmed/34794473 http://dx.doi.org/10.1186/s13023-021-02120-9 |
work_keys_str_mv | AT zhuqian scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT nguyenðactrung scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT sheilstimothy scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT alyeagioconda scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT sideric scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT xuyanji scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT dickensjames scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT matheewya scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph AT pariseranne scientificevidencebasedrarediseaseresearchdiscoverywithresearchfundingdatainknowledgegraph |
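
The abstract describes identifying NIH-funded projects for rare diseases by mapping GARD disease names to project titles. The record does not say how that semantic annotation was implemented, so the following is only a minimal sketch of the general idea, assuming plain case-insensitive whole-phrase matching as a stand-in. The function names, project identifiers, and project titles are hypothetical and are not taken from GARD or NIH ExPORTER.

```python
import re
from collections import defaultdict


def build_matcher(disease_name):
    """Compile a case-insensitive, whole-phrase pattern for one disease name."""
    # Lookarounds keep "ALS" from matching inside "signals", for example.
    return re.compile(r"(?<!\w)" + re.escape(disease_name) + r"(?!\w)", re.IGNORECASE)


def map_diseases_to_projects(diseases, projects):
    """Return {disease name: sorted project ids whose title mentions the name}."""
    matchers = {name: build_matcher(name) for name in diseases}
    mapping = defaultdict(list)
    for project_id, title in projects.items():
        for name, pattern in matchers.items():
            if pattern.search(title):
                mapping[name].append(project_id)
    # Diseases with no matching project are simply absent from the result.
    return {name: sorted(ids) for name, ids in mapping.items()}


# Hypothetical inputs for illustration only.
diseases = ["Cystic fibrosis", "Huntington disease", "Fabry disease"]
projects = {
    "P001": "Airway inflammation in cystic fibrosis",
    "P002": "Biomarkers for Huntington disease progression",
    "P003": "Gene therapy vectors for cystic fibrosis and related disorders",
    "P004": "Deep learning for retinal imaging",
}

mapping = map_diseases_to_projects(diseases, projects)

# Share of diseases with at least one mapped project.
coverage = len(mapping) / len(diseases)
```

Applied to the figures reported in the abstract, the same coverage ratio would be 1294 of 5001 diseases, or roughly 25.9%. Exact phrase matching misses synonyms and abbreviations, which is one reason a real annotation pipeline would likely need a richer vocabulary than this sketch uses.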