
NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories

Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF...


Bibliographic Details
Main Authors: Munarko, Yuda; Sarwar, Dewan M.; Rampadarath, Anand; Atalag, Koray; Gennari, John H.; Neal, Maxwell L.; Nickerson, David P.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908213/
https://www.ncbi.nlm.nih.gov/pubmed/35283794
http://dx.doi.org/10.3389/fphys.2022.820683
_version_ 1784665829843927040
author Munarko, Yuda
Sarwar, Dewan M.
Rampadarath, Anand
Atalag, Koray
Gennari, John H.
Neal, Maxwell L.
Nickerson, David P.
author_facet Munarko, Yuda
Sarwar, Dewan M.
Rampadarath, Anand
Atalag, Koray
Gennari, John H.
Neal, Maxwell L.
Nickerson, David P.
author_sort Munarko, Yuda
collection PubMed
description Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF provides the flexibility to enable searching for entities within models (e.g., variables, equations, or entire models) by utilizing the RDF query language SPARQL. However, the rigidity and complexity of the SPARQL syntax and the nature of the tree-like structure of semantic annotations are challenging for users. Therefore, we propose NLIMED, an interface that converts natural language queries into SPARQL. We use this interface to query and discover model entities from repositories of biosimulation models. NLIMED works with the Physiome Model Repository (PMR) and the BioModels database and potentially other repositories annotated using RDF. Natural language queries are first “chunked” into phrases and annotated against ontology classes and predicates utilizing different natural language processing tools. Then, the ontology classes and predicates are composed as SPARQL and finally ranked using our SPARQL Composer and our indexing system. We demonstrate that NLIMED's approach for chunking and annotating queries is more effective than the NCBO Annotator for identifying relevant ontology classes in natural language queries. Comparison of NLIMED's behavior against historical query records in the PMR shows that it can adapt appropriately to queries associated with well-annotated models.
format Online
Article
Text
id pubmed-8908213
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8908213 2022-03-11 NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories Munarko, Yuda Sarwar, Dewan M. Rampadarath, Anand Atalag, Koray Gennari, John H. Neal, Maxwell L. Nickerson, David P. Front Physiol Physiology Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF provides the flexibility to enable searching for entities within models (e.g., variables, equations, or entire models) by utilizing the RDF query language SPARQL. However, the rigidity and complexity of the SPARQL syntax and the nature of the tree-like structure of semantic annotations are challenging for users. Therefore, we propose NLIMED, an interface that converts natural language queries into SPARQL. We use this interface to query and discover model entities from repositories of biosimulation models. NLIMED works with the Physiome Model Repository (PMR) and the BioModels database and potentially other repositories annotated using RDF. Natural language queries are first “chunked” into phrases and annotated against ontology classes and predicates utilizing different natural language processing tools. Then, the ontology classes and predicates are composed as SPARQL and finally ranked using our SPARQL Composer and our indexing system. We demonstrate that NLIMED's approach for chunking and annotating queries is more effective than the NCBO Annotator for identifying relevant ontology classes in natural language queries. Comparison of NLIMED's behavior against historical query records in the PMR shows that it can adapt appropriately to queries associated with well-annotated models. Frontiers Media S.A.
2022-02-24 /pmc/articles/PMC8908213/ /pubmed/35283794 http://dx.doi.org/10.3389/fphys.2022.820683 Text en Copyright © 2022 Munarko, Sarwar, Rampadarath, Atalag, Gennari, Neal and Nickerson. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Physiology
Munarko, Yuda
Sarwar, Dewan M.
Rampadarath, Anand
Atalag, Koray
Gennari, John H.
Neal, Maxwell L.
Nickerson, David P.
NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title_full NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title_fullStr NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title_full_unstemmed NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title_short NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
title_sort nlimed: natural language interface for model entity discovery in biosimulation model repositories
topic Physiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908213/
https://www.ncbi.nlm.nih.gov/pubmed/35283794
http://dx.doi.org/10.3389/fphys.2022.820683
work_keys_str_mv AT munarkoyuda nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT sarwardewanm nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT rampadarathanand nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT atalagkoray nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT gennarijohnh nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT nealmaxwelll nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories
AT nickersondavidp nlimednaturallanguageinterfaceformodelentitydiscoveryinbiosimulationmodelrepositories