Performance assessment of ontology matching systems for FAIR data

BACKGROUND: Ontology matching should contribute to the interoperability aspect of FAIR data (Findable, Accessible, Interoperable, and Reusable). Multiple data sources can use different ontologies for annotating their data, thus creating the need for dynamic ontology matching services. In this e...

Bibliographic Details
Main Authors: van Damme, Philip, Fernández-Breis, Jesualdo Tomás, Benis, Nirupama, Miñarro-Gimenez, Jose Antonio, de Keizer, Nicolette F., Cornet, Ronald
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9284868/
https://www.ncbi.nlm.nih.gov/pubmed/35841031
http://dx.doi.org/10.1186/s13326-022-00273-5
author van Damme, Philip
Fernández-Breis, Jesualdo Tomás
Benis, Nirupama
Miñarro-Gimenez, Jose Antonio
de Keizer, Nicolette F.
Cornet, Ronald
collection PubMed
description BACKGROUND: Ontology matching should contribute to the interoperability aspect of FAIR data (Findable, Accessible, Interoperable, and Reusable). Multiple data sources can use different ontologies for annotating their data, thus creating the need for dynamic ontology matching services. In this experimental study, we assessed the performance of ontology matching systems in the context of a real-life application from the rare disease domain. Additionally, we present a method for analyzing top-level classes to improve precision.
RESULTS: We included three ontologies (NCIt, SNOMED CT, ORDO) and three matching systems (AgreementMakerLight 2.0, FCA-Map, LogMap 2.0). We evaluated the performance of the matching systems against reference alignments from BioPortal and the Unified Medical Language System Metathesaurus (UMLS). Then, we analyzed the top-level ancestors of matched classes to detect incorrect mappings without consulting a reference alignment. To detect such incorrect mappings, we manually matched semantically equivalent top-level classes of ontology pairs. AgreementMakerLight 2.0, FCA-Map, and LogMap 2.0 had F1-scores of 0.55, 0.46, and 0.55 for BioPortal and 0.66, 0.53, and 0.58 for the UMLS, respectively. Using vote-based consensus alignments increased performance across the board. Evaluation with manually created top-level hierarchy mappings revealed that on average 90% of the mappings’ classes belonged to top-level classes that matched.
CONCLUSIONS: Our findings show that the included ontology matching systems automatically produced mappings that were modestly accurate according to our evaluation. The hierarchical analysis of mappings seems promising when no reference alignments are available. All in all, the systems show potential to be implemented as part of an ontology matching service for querying FAIR data. Future research should focus on developing methods for the evaluation of mappings used in such mapping services, leading to their implementation in a FAIR data ecosystem.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s13326-022-00273-5).
format Online
Article
Text
id pubmed-9284868
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9284868 2022-07-16 Performance assessment of ontology matching systems for FAIR data
van Damme, Philip
Fernández-Breis, Jesualdo Tomás
Benis, Nirupama
Miñarro-Gimenez, Jose Antonio
de Keizer, Nicolette F.
Cornet, Ronald
J Biomed Semantics
Research
BioMed Central 2022-07-15
/pmc/articles/PMC9284868/ /pubmed/35841031 http://dx.doi.org/10.1186/s13326-022-00273-5
Text en
© The Author(s) 2022
https://creativecommons.org/licenses/by/4.0/
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Performance assessment of ontology matching systems for FAIR data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9284868/
https://www.ncbi.nlm.nih.gov/pubmed/35841031
http://dx.doi.org/10.1186/s13326-022-00273-5