
Improved ontology-based similarity calculations using a study-wise annotation model

A typical use case of ontologies is the calculation of similarity scores between items that are annotated with classes of the ontology. For example, in differential diagnostics and disease gene prioritization, the human phenotype ontology (HPO) is often used to compare a query phenotype profile agai...
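The use case described above, scoring a query profile against flat lists of ontology classes, can be illustrated with a minimal sketch. This is not the study-wise model from the article: it uses a common Resnik-style measure (information content of the most informative common ancestor) with a best-match average. The ontology, the term identifiers, the disease names, and every function name below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy ontology: class -> set of direct parents (not real HPO IDs).
PARENTS = {
    "T:root": set(),
    "T:neuro": {"T:root"},
    "T:seizure": {"T:neuro"},
    "T:ataxia": {"T:neuro"},
    "T:cardio": {"T:root"},
    "T:arrhythmia": {"T:cardio"},
}

# Hypothetical flat "gold-standard" profiles: item -> set of annotated classes.
PROFILES = {
    "disease_A": {"T:seizure", "T:ataxia"},
    "disease_B": {"T:arrhythmia"},
    "disease_C": {"T:seizure", "T:arrhythmia"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(N / n_t), where n_t counts items annotated with t or a descendant."""
    counts = {t: 0 for t in PARENTS}
    for terms in PROFILES.values():
        implied = set().union(*(ancestors(t) for t in terms))
        for t in implied:
            counts[t] += 1
    n = len(PROFILES)
    return {t: math.log(n / c) for t, c in counts.items() if c > 0}


def resnik(t1, t2, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic.get(t, 0.0) for t in common)


def profile_similarity(query, target, ic):
    """Average, over query classes, of the best-matching class in the target."""
    return sum(max(resnik(q, t, ic) for t in target) for q in query) / len(query)


def rank(query):
    """Rank all profiles by similarity to the query, best first."""
    ic = information_content()
    scores = {name: profile_similarity(query, terms, ic)
              for name, terms in PROFILES.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank({"T:ataxia"}):
        print(f"{name}\t{score:.3f}")
```

In this sketch every annotation in a profile counts equally. The article argues that such flat lists can be improved by weighting annotations and keeping the study-wise structure, which this toy example does not attempt.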


Bibliographic Details
Main Author: Köhler, Sebastian
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5868182/
https://www.ncbi.nlm.nih.gov/pubmed/29688377
http://dx.doi.org/10.1093/database/bay026
author Köhler, Sebastian
collection PubMed
description A typical use case of ontologies is the calculation of similarity scores between items that are annotated with classes of the ontology. For example, in differential diagnostics and disease gene prioritization, the human phenotype ontology (HPO) is often used to compare a query phenotype profile against gold-standard phenotype profiles of diseases or genes. The latter have long been constructed as flat lists of ontology classes, which, as we show in this work, can be improved by exploiting existing structure and information in annotation datasets or full-text disease descriptions. We derive a study-wise annotation model of diseases and genes and show that this can improve the performance of semantic similarity measures. Inferred weights of individual annotations are one reason for this improvement, but more importantly, using the study-wise structure further boosts the results of the algorithms according to precision-recall analyses. We test the study-wise annotation model for diseases annotated with classes from the HPO and for genes annotated with gene ontology (GO) classes. We incorporate this annotation model into similarity algorithms and show how this leads to improved performance. This work adds weight to the need for enhancing simple list-based representations of disease or gene annotations. We show how study-wise annotations can be automatically derived from full-text summaries of disease descriptions and from the annotation data provided by the GO Consortium, and how semantic similarity measures can utilize this extended annotation model. Database URL: https://phenomics.github.io/
format Online
Article
Text
id pubmed-5868182
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5868182 2018-03-29 Improved ontology-based similarity calculations using a study-wise annotation model Köhler, Sebastian Database (Oxford) Original Article Oxford University Press 2018-03-23 /pmc/articles/PMC5868182/ /pubmed/29688377 http://dx.doi.org/10.1093/database/bay026 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Improved ontology-based similarity calculations using a study-wise annotation model
topic Original Article