
Prediction and curation of missing biomedical identifier mappings with Biomappings

MOTIVATION: Biomedical identifier resources (such as ontologies, taxonomies, and controlled vocabularies) commonly overlap in scope and contain equivalent entries under different identifiers. Maintaining mappings between these entries is crucial for interoperability and the integration of data and k...


Bibliographic Details
Main authors: Hoyt, Charles Tapley; Hoyt, Amelia L; Gyori, Benjamin M
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10076045/
https://www.ncbi.nlm.nih.gov/pubmed/36916735
http://dx.doi.org/10.1093/bioinformatics/btad130
author Hoyt, Charles Tapley
Hoyt, Amelia L
Gyori, Benjamin M
collection PubMed
description MOTIVATION: Biomedical identifier resources (such as ontologies, taxonomies, and controlled vocabularies) commonly overlap in scope and contain equivalent entries under different identifiers. Maintaining mappings between these entries is crucial for interoperability and the integration of data and knowledge. However, there are substantial gaps in available mappings motivating their semi-automated curation. RESULTS: Biomappings implements a curation workflow for missing mappings which combines automated prediction with human-in-the-loop curation. It supports multiple prediction approaches and provides a web-based user interface for reviewing predicted mappings for correctness, combined with automated consistency checking. Predicted and curated mappings are made available in public, version-controlled resource files on GitHub. Biomappings currently makes available 9274 curated mappings and 40 691 predicted ones, providing previously missing mappings between widely used identifier resources covering small molecules, cell lines, diseases, and other concepts. We demonstrate the value of Biomappings on case studies involving predicting and curating missing mappings among cancer cell lines as well as small molecules tested in clinical trials. We also present how previously missing mappings curated using Biomappings were contributed back to multiple widely used community ontologies. AVAILABILITY AND IMPLEMENTATION: The data and code are available under the CC0 and MIT licenses at https://github.com/biopragmatics/biomappings.
format Online
Article
Text
id pubmed-10076045
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10076045 2023-04-06 Prediction and curation of missing biomedical identifier mappings with Biomappings Hoyt, Charles Tapley Hoyt, Amelia L Gyori, Benjamin M Bioinformatics Original Paper Oxford University Press 2023-03-14 /pmc/articles/PMC10076045/ /pubmed/36916735 http://dx.doi.org/10.1093/bioinformatics/btad130 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Prediction and curation of missing biomedical identifier mappings with Biomappings
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10076045/
https://www.ncbi.nlm.nih.gov/pubmed/36916735
http://dx.doi.org/10.1093/bioinformatics/btad130