Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web

BACKGROUND: Clinical phenotypes and disease-risk stratification are most often determined through the direct observations of clinicians in conjunction with published standards and guidelines, where the clinical expert is the final arbiter of the patient’s classification. While this "human"...

Bibliographic Details
Main authors: Samadian, Soroush; McManus, Bruce; Wilkinson, Mark D
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639885/
https://www.ncbi.nlm.nih.gov/pubmed/22818710
http://dx.doi.org/10.1186/2041-1480-3-6
_version_ 1782476010855333888
author Samadian, Soroush
McManus, Bruce
Wilkinson, Mark D
author_facet Samadian, Soroush
McManus, Bruce
Wilkinson, Mark D
author_sort Samadian, Soroush
collection PubMed
description BACKGROUND: Clinical phenotypes and disease-risk stratification are most often determined through the direct observations of clinicians in conjunction with published standards and guidelines, where the clinical expert is the final arbiter of the patient’s classification. While this "human" approach is highly desirable in the context of personalized and optimal patient care, it is problematic in a healthcare research setting because the basis for the patient's classification is not transparent, and likely not reproducible from one clinical expert to another. This sits in opposition to the rigor required to execute, for example, genome-wide association analyses and other high-throughput studies where a large number of variables are being compared to a complex disease phenotype. Most clinical classification systems are not structured for automated classification, and similarly, clinical data is generally not represented in a form that lends itself to automated integration and interpretation. Here we apply Semantic Web technologies to the problem of automated, transparent interpretation of clinical data for use in high-throughput research environments, and explore migration-paths for existing data and legacy semantic standards. RESULTS: Using a dataset from a cardiovascular cohort collected two decades ago, we present a migration path - both for the terminologies/classification systems and the data - that enables rich automated clinical classification using well-established standards. This is achieved by establishing a simple and flexible core data model, which is combined with a layered ontological framework utilizing both logical reasoning and analytical algorithms to iteratively "lift" clinical data through increasingly complex layers of interpretation and classification.
We compare our automated analysis to that of the clinical expert, and discrepancies are used to refine the ontological models, finally arriving at ontologies that mirror the expert opinion of the individual clinical researcher. Other discrepancies, however, could not be as easily modeled, and we evaluate what information we are lacking that would allow these discrepancies to be resolved in an automated manner. CONCLUSIONS: We demonstrate that the combination of semantically-explicit data, logically rigorous models of clinical guidelines, and publicly-accessible Semantic Web Services, can be used to execute automated, rigorous and reproducible clinical classifications with an accuracy approaching that of an expert. Discrepancies between the manual and automatic approaches reveal, as expected, that clinicians do not always rigorously follow established guidelines for classification; however, we demonstrate that "personalized" ontologies may represent a re-usable and transparent approach to modeling individual clinical expertise, leading to more reproducible science.
format Online
Article
Text
id pubmed-3639885
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3639885 2013-05-06 Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web Samadian, Soroush McManus, Bruce Wilkinson, Mark D J Biomed Semantics Research BACKGROUND: Clinical phenotypes and disease-risk stratification are most often determined through the direct observations of clinicians in conjunction with published standards and guidelines, where the clinical expert is the final arbiter of the patient’s classification. While this "human" approach is highly desirable in the context of personalized and optimal patient care, it is problematic in a healthcare research setting because the basis for the patient's classification is not transparent, and likely not reproducible from one clinical expert to another. This sits in opposition to the rigor required to execute, for example, genome-wide association analyses and other high-throughput studies where a large number of variables are being compared to a complex disease phenotype. Most clinical classification systems are not structured for automated classification, and similarly, clinical data is generally not represented in a form that lends itself to automated integration and interpretation. Here we apply Semantic Web technologies to the problem of automated, transparent interpretation of clinical data for use in high-throughput research environments, and explore migration-paths for existing data and legacy semantic standards. RESULTS: Using a dataset from a cardiovascular cohort collected two decades ago, we present a migration path - both for the terminologies/classification systems and the data - that enables rich automated clinical classification using well-established standards.
This is achieved by establishing a simple and flexible core data model, which is combined with a layered ontological framework utilizing both logical reasoning and analytical algorithms to iteratively "lift" clinical data through increasingly complex layers of interpretation and classification. We compare our automated analysis to that of the clinical expert, and discrepancies are used to refine the ontological models, finally arriving at ontologies that mirror the expert opinion of the individual clinical researcher. Other discrepancies, however, could not be as easily modeled, and we evaluate what information we are lacking that would allow these discrepancies to be resolved in an automated manner. CONCLUSIONS: We demonstrate that the combination of semantically-explicit data, logically rigorous models of clinical guidelines, and publicly-accessible Semantic Web Services, can be used to execute automated, rigorous and reproducible clinical classifications with an accuracy approaching that of an expert. Discrepancies between the manual and automatic approaches reveal, as expected, that clinicians do not always rigorously follow established guidelines for classification; however, we demonstrate that "personalized" ontologies may represent a re-usable and transparent approach to modeling individual clinical expertise, leading to more reproducible science. BioMed Central 2012-07-20 /pmc/articles/PMC3639885/ /pubmed/22818710 http://dx.doi.org/10.1186/2041-1480-3-6 Text en Copyright © 2012 Samadian et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Samadian, Soroush
McManus, Bruce
Wilkinson, Mark D
Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title_full Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title_fullStr Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title_full_unstemmed Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title_short Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
title_sort extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639885/
https://www.ncbi.nlm.nih.gov/pubmed/22818710
http://dx.doi.org/10.1186/2041-1480-3-6
work_keys_str_mv AT samadiansoroush extendingandencodingexistingbiologicalterminologiesanddatasetsforuseinthereasonedsemanticweb
AT mcmanusbruce extendingandencodingexistingbiologicalterminologiesanddatasetsforuseinthereasonedsemanticweb
AT wilkinsonmarkd extendingandencodingexistingbiologicalterminologiesanddatasetsforuseinthereasonedsemanticweb