Improved ontology-based similarity calculations using a study-wise annotation model
A typical use case of ontologies is the calculation of similarity scores between items that are annotated with classes of the ontology. For example, in differential diagnostics and disease gene prioritization, the human phenotype ontology (HPO) is often used to compare a query phenotype profile agai...
Main Author: | Köhler, Sebastian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5868182/ https://www.ncbi.nlm.nih.gov/pubmed/29688377 http://dx.doi.org/10.1093/database/bay026 |
_version_ | 1783309107313246208 |
---|---|
author | Köhler, Sebastian |
author_facet | Köhler, Sebastian |
author_sort | Köhler, Sebastian |
collection | PubMed |
description | A typical use case of ontologies is the calculation of similarity scores between items that are annotated with classes of the ontology. For example, in differential diagnostics and disease gene prioritization, the Human Phenotype Ontology (HPO) is often used to compare a query phenotype profile against gold-standard phenotype profiles of diseases or genes. The latter have long been constructed as flat lists of ontology classes, which, as we show in this work, can be improved by exploiting existing structure and information in annotation datasets or full-text disease descriptions. We derive a study-wise annotation model of diseases and genes and show that this can improve the performance of semantic similarity measures. Inferred weights of individual annotations are one reason for this improvement, but, more importantly, using the study-wise structure further boosts the results of the algorithms according to precision-recall analyses. We test the study-wise annotation model for diseases annotated with classes from the HPO and for genes annotated with Gene Ontology (GO) classes. We incorporate this annotation model into similarity algorithms and show how this leads to improved performance. This work adds weight to the need for enhancing simple list-based representations of disease or gene annotations. We show how study-wise annotations can be automatically derived from full-text summaries of disease descriptions and from the annotation data provided by the GO Consortium and how semantic similarity measures can utilize this extended annotation model. Database URL: https://phenomics.github.io/ |
format | Online Article Text |
id | pubmed-5868182 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58681822018-03-29 Improved ontology-based similarity calculations using a study-wise annotation model Köhler, Sebastian Database (Oxford) Original Article A typical use case of ontologies is the calculation of similarity scores between items that are annotated with classes of the ontology. For example, in differential diagnostics and disease gene prioritization, the Human Phenotype Ontology (HPO) is often used to compare a query phenotype profile against gold-standard phenotype profiles of diseases or genes. The latter have long been constructed as flat lists of ontology classes, which, as we show in this work, can be improved by exploiting existing structure and information in annotation datasets or full-text disease descriptions. We derive a study-wise annotation model of diseases and genes and show that this can improve the performance of semantic similarity measures. Inferred weights of individual annotations are one reason for this improvement, but, more importantly, using the study-wise structure further boosts the results of the algorithms according to precision-recall analyses. We test the study-wise annotation model for diseases annotated with classes from the HPO and for genes annotated with Gene Ontology (GO) classes. We incorporate this annotation model into similarity algorithms and show how this leads to improved performance. This work adds weight to the need for enhancing simple list-based representations of disease or gene annotations. We show how study-wise annotations can be automatically derived from full-text summaries of disease descriptions and from the annotation data provided by the GO Consortium and how semantic similarity measures can utilize this extended annotation model. Database URL: https://phenomics.github.io/ Oxford University Press 2018-03-23 /pmc/articles/PMC5868182/ /pubmed/29688377 http://dx.doi.org/10.1093/database/bay026 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Köhler, Sebastian Improved ontology-based similarity calculations using a study-wise annotation model |
title | Improved ontology-based similarity calculations using a study-wise annotation model |
title_full | Improved ontology-based similarity calculations using a study-wise annotation model |
title_fullStr | Improved ontology-based similarity calculations using a study-wise annotation model |
title_full_unstemmed | Improved ontology-based similarity calculations using a study-wise annotation model |
title_short | Improved ontology-based similarity calculations using a study-wise annotation model |
title_sort | improved ontology-based similarity calculations using a study-wise annotation model |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5868182/ https://www.ncbi.nlm.nih.gov/pubmed/29688377 http://dx.doi.org/10.1093/database/bay026 |
work_keys_str_mv | AT kohlersebastian improvedontologybasedsimilaritycalculationsusingastudywiseannotationmodel |