Tackling the challenges of matching biomedical ontologies


Bibliographic Details
Main authors: Faria, Daniel, Pesquita, Catia, Mott, Isabela, Martins, Catarina, Couto, Francisco M., Cruz, Isabel F.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5769431/
https://www.ncbi.nlm.nih.gov/pubmed/29335022
http://dx.doi.org/10.1186/s13326-017-0170-9
author Faria, Daniel
Pesquita, Catia
Mott, Isabela
Martins, Catarina
Couto, Francisco M.
Cruz, Isabel F.
collection PubMed
description BACKGROUND: Biomedical ontologies pose several challenges to ontology matching due both to the complexity of the biomedical domain and to the characteristics of the ontologies themselves. The biomedical tracks in the Ontology Alignment Evaluation Initiative (OAEI) have spurred the development of matching systems able to tackle these challenges, and benchmarked their general performance. In this study, we dissect the strategies employed by matching systems to tackle the challenges of matching biomedical ontologies and gauge the impact of the challenges themselves on matching performance, using the AgreementMakerLight (AML) system as the platform. RESULTS: We demonstrate that the linear complexity of the hash-based searching strategy implemented by most state-of-the-art ontology matching systems is essential for matching large biomedical ontologies efficiently. We show that accounting for all lexical annotations (e.g., labels and synonyms) in biomedical ontologies leads to a substantial improvement in F-measure over using only the primary name, and that accounting for the reliability of different types of annotations generally also leads to a marked improvement. Finally, we show that cross-references are a reliable source of information and that, when using biomedical ontologies as background knowledge, it is generally more reliable to use them as mediators than to perform lexical expansion. CONCLUSIONS: We anticipate that translating traditional matching algorithms to the hash-based searching paradigm will be a critical direction for the future development of the field. Improving the evaluation carried out in the biomedical tracks of the OAEI will also be important, as without proper reference alignments there is only so much that can be ascertained about matching systems or strategies. Nevertheless, it is clear that, to tackle the various challenges posed by biomedical ontologies, ontology matching systems must be able to efficiently combine multiple strategies into a mature matching approach. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-017-0170-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5769431
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5769431 2018-01-25 Tackling the challenges of matching biomedical ontologies. Faria, Daniel; Pesquita, Catia; Mott, Isabela; Martins, Catarina; Couto, Francisco M.; Cruz, Isabel F. J Biomed Semantics. Research. BioMed Central 2018-01-15. /pmc/articles/PMC5769431/ /pubmed/29335022 http://dx.doi.org/10.1186/s13326-017-0170-9 Text, en.
© The Author(s) 2018. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Tackling the challenges of matching biomedical ontologies
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5769431/
https://www.ncbi.nlm.nih.gov/pubmed/29335022
http://dx.doi.org/10.1186/s13326-017-0170-9