
An integrative knowledge graph for rare diseases, derived from the Genetic and Rare Diseases Information Center (GARD)

BACKGROUND: The Genetic and Rare Diseases (GARD) Information Center was established by the National Institutes of Health (NIH) to provide freely accessible consumer health information on over 6500 genetic and rare diseases. As the cumulative scientific understanding and underlying evidence for these...


Bibliographic Details
Main Authors: Zhu, Qian; Nguyen, Dac-Trung; Grishagin, Ivan; Southall, Noel; Sid, Eric; Pariser, Anne
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7663894/
https://www.ncbi.nlm.nih.gov/pubmed/33183351
http://dx.doi.org/10.1186/s13326-020-00232-y
author Zhu, Qian
Nguyen, Dac-Trung
Grishagin, Ivan
Southall, Noel
Sid, Eric
Pariser, Anne
collection PubMed
description BACKGROUND: The Genetic and Rare Diseases (GARD) Information Center was established by the National Institutes of Health (NIH) to provide freely accessible consumer health information on over 6500 genetic and rare diseases. As the cumulative scientific understanding and underlying evidence for these diseases have expanded over time, existing practices to generate knowledge from these publications and resources have not been able to keep pace. By determining the applicability of computational approaches to enhancing or replacing manual curation tasks, we aim both to improve the sustainability and relevance of consumer health information and to develop a foundational database from which translational science researchers may start to unravel disease characteristics that are vital to the research process.
RESULTS: We developed a meta-ontology-based integrative knowledge graph for rare diseases in Neo4j. This integrative knowledge graph includes a total of 3,819,623 nodes and 84,223,681 relations from 34 different biomedical data resources, including curated drug and rare disease associations. Semi-automatic mappings were generated for 2154 unique FDA orphan designations to 776 unique GARD diseases, and 3322 unique FDA designated drugs to UNII, as well as 180,363 associations between drug and indication from Inxight Drugs, which were integrated into the knowledge graph. We conducted four case studies to demonstrate the capabilities of this integrative knowledge graph in accelerating the curation of scientific understanding on rare diseases through the generation of disease mappings/profiles and pathogenesis associations.
CONCLUSIONS: By integrating well-established database resources, we developed an integrative knowledge graph containing a large volume of biomedical and research data. Demonstration of several immediate use cases and limitations of this process reveals both the potential feasibility and barriers of utilizing graph-based resources and approaches to support their use by providers of consumer health information, such as GARD, that may struggle with the needs of maintaining knowledge reliant on an evolving and growing evidence base. Finally, the successful integration of these datasets into a freely accessible knowledge graph highlights an opportunity to take a translational science view on the field of rare diseases by enabling researchers to identify disease characteristics that may play a role in the translation of discoveries across different research domains.
format Online
Article
Text
id pubmed-7663894
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7663894 2020-11-13 An integrative knowledge graph for rare diseases, derived from the Genetic and Rare Diseases Information Center (GARD) Zhu, Qian; Nguyen, Dac-Trung; Grishagin, Ivan; Southall, Noel; Sid, Eric; Pariser, Anne J Biomed Semantics Research BioMed Central 2020-11-12 /pmc/articles/PMC7663894/ /pubmed/33183351 http://dx.doi.org/10.1186/s13326-020-00232-y Text en © The Author(s) 2020
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An integrative knowledge graph for rare diseases, derived from the Genetic and Rare Diseases Information Center (GARD)
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7663894/
https://www.ncbi.nlm.nih.gov/pubmed/33183351
http://dx.doi.org/10.1186/s13326-020-00232-y