
Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization: Preliminary Study

BACKGROUND: Although many efforts have been made to develop comprehensive disease resources that capture rare disease information for the purpose of clinical decision making and education, there is no standardized protocol for defining and harmonizing rare diseases across multiple resources. This in...


Bibliographic Details
Main authors: Zhu, Qian; Nguyen, Dac-Trung; Alyea, Gioconda; Hanson, Karen; Sid, Eric; Pariser, Anne
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7568218/
https://www.ncbi.nlm.nih.gov/pubmed/33006565
http://dx.doi.org/10.2196/18395
author Zhu, Qian
Nguyen, Dac-Trung
Alyea, Gioconda
Hanson, Karen
Sid, Eric
Pariser, Anne
collection PubMed
description BACKGROUND: Although many efforts have been made to develop comprehensive disease resources that capture rare disease information for the purpose of clinical decision making and education, there is no standardized protocol for defining and harmonizing rare diseases across multiple resources. This introduces data redundancy and inconsistency that may ultimately increase confusion and difficulty for the wide use of these resources. To overcome such encumbrances, we report our preliminary study to identify phenotypical similarity among genetic and rare diseases (GARD) that present similar clinical manifestations, and to support further data harmonization. OBJECTIVE: To support rare disease data harmonization, we aim to systematically identify phenotypically similar GARD diseases from a disease-oriented integrative knowledge graph and determine their similarity types. METHODS: We identified phenotypically similar GARD diseases programmatically with 2 methods: (1) we measured disease similarity by comparing disease mappings between GARD and other rare disease resources, incorporating manual assessment; (2) we derived clinical manifestations presenting among sibling diseases from disease classifications and prioritized the identified similar diseases based on their phenotypes and genotypes. RESULTS: For disease similarity comparison, approximately 87% (341/392) of the identified phenotypically similar disease pairs were validated; 80% (271/392) of these disease pairs were accurately identified as phenotypically similar based on similarity score. The evaluation result shows a high precision (94%) and a satisfactory quality (86% F measure). By deriving phenotypical similarity from Monarch Disease Ontology (MONDO) and Orphanet disease classification trees, we identified a total of 360 disease pairs with at least 1 shared clinical phenotype and gene, which were applied for prioritizing clinical relevance. A total of 662 phenotypically similar disease pairs were identified and will be applied for GARD data harmonization. CONCLUSIONS: We successfully identified phenotypically similar rare diseases among the GARD diseases via 2 approaches: disease mapping comparison and phenotypical similarity derivation from disease classification systems. The results will not only direct GARD data harmonization in expanding translational science research but will also accelerate data transparency and consistency across different disease resources and terminologies, helping to build a robust and up-to-date knowledge resource on rare diseases.
format Online
Article
Text
id pubmed-7568218
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7568218 2020-11-02 Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization: Preliminary Study Zhu, Qian; Nguyen, Dac-Trung; Alyea, Gioconda; Hanson, Karen; Sid, Eric; Pariser, Anne JMIR Med Inform Original Paper JMIR Publications 2020-10-02 /pmc/articles/PMC7568218/ /pubmed/33006565 http://dx.doi.org/10.2196/18395 Text en ©Qian Zhu, Dac-Trung Nguyen, Gioconda Alyea, Karen Hanson, Eric Sid, Anne Pariser. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 02.10.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Phenotypically Similar Rare Disease Identification from an Integrative Knowledge Graph for Data Harmonization: Preliminary Study
topic Original Paper