NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories
Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF...
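Main authors: | Munarko, Yuda; Sarwar, Dewan M.; Rampadarath, Anand; Atalag, Koray; Gennari, John H.; Neal, Maxwell L.; Nickerson, David P. |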
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Physiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908213/ https://www.ncbi.nlm.nih.gov/pubmed/35283794 http://dx.doi.org/10.3389/fphys.2022.820683 |
_version_ | 1784665829843927040 |
---|---|
author | Munarko, Yuda Sarwar, Dewan M. Rampadarath, Anand Atalag, Koray Gennari, John H. Neal, Maxwell L. Nickerson, David P. |
author_sort | Munarko, Yuda |
collection | PubMed |
description | Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF provides the flexibility to enable searching for entities within models (e.g., variables, equations, or entire models) by utilizing the RDF query language SPARQL. However, the rigidity and complexity of the SPARQL syntax and the tree-like structure of semantic annotations are challenging for users. Therefore, we propose NLIMED, an interface that converts natural language queries into SPARQL. We use this interface to query and discover model entities from repositories of biosimulation models. NLIMED works with the Physiome Model Repository (PMR) and the BioModels database and potentially other repositories annotated using RDF. Natural language queries are first “chunked” into phrases and annotated against ontology classes and predicates utilizing different natural language processing tools. Then, the ontology classes and predicates are composed as SPARQL and finally ranked using our SPARQL Composer and our indexing system. We demonstrate that NLIMED's approach for chunking and annotating queries is more effective than the NCBO Annotator for identifying relevant ontology classes in natural language queries. Comparison of NLIMED's behavior against historical query records in the PMR shows that it can adapt appropriately to queries associated with well-annotated models. |
format | Online Article Text |
id | pubmed-8908213 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8908213 2022-03-11 NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories Munarko, Yuda Sarwar, Dewan M. Rampadarath, Anand Atalag, Koray Gennari, John H. Neal, Maxwell L. Nickerson, David P. Front Physiol Physiology Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF provides the flexibility to enable searching for entities within models (e.g., variables, equations, or entire models) by utilizing the RDF query language SPARQL. However, the rigidity and complexity of the SPARQL syntax and the tree-like structure of semantic annotations are challenging for users. Therefore, we propose NLIMED, an interface that converts natural language queries into SPARQL. We use this interface to query and discover model entities from repositories of biosimulation models. NLIMED works with the Physiome Model Repository (PMR) and the BioModels database and potentially other repositories annotated using RDF. Natural language queries are first “chunked” into phrases and annotated against ontology classes and predicates utilizing different natural language processing tools. Then, the ontology classes and predicates are composed as SPARQL and finally ranked using our SPARQL Composer and our indexing system. We demonstrate that NLIMED's approach for chunking and annotating queries is more effective than the NCBO Annotator for identifying relevant ontology classes in natural language queries. Comparison of NLIMED's behavior against historical query records in the PMR shows that it can adapt appropriately to queries associated with well-annotated models. Frontiers Media S.A. 2022-02-24 /pmc/articles/PMC8908213/ /pubmed/35283794 http://dx.doi.org/10.3389/fphys.2022.820683 Text en Copyright © 2022 Munarko, Sarwar, Rampadarath, Atalag, Gennari, Neal and Nickerson. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Physiology Munarko, Yuda Sarwar, Dewan M. Rampadarath, Anand Atalag, Koray Gennari, John H. Neal, Maxwell L. Nickerson, David P. NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories |
title | NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories |
topic | Physiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8908213/ https://www.ncbi.nlm.nih.gov/pubmed/35283794 http://dx.doi.org/10.3389/fphys.2022.820683 |
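
The description in this record outlines a pipeline in which a natural language query is chunked into phrases, the phrases are annotated with ontology classes, and the classes are composed into SPARQL. The following is a minimal sketch of that general idea, not NLIMED's actual implementation: the lexicon, the `example.org` IRIs, the predicate, and the function names `annotate` and `compose_sparql` are all hypothetical, and a greedy longest-match lookup stands in for the natural language processing tools the abstract mentions. Ranking of candidate queries is omitted.

```python
# Hypothetical phrase-to-class lexicon; the IRIs are placeholders,
# not real ontology terms.
LEXICON = {
    "concentration": "http://example.org/onto/Concentration",
    "sodium": "http://example.org/onto/Sodium",
    "portion of cytosol": "http://example.org/onto/Cytosol",
}
PREDICATE = "http://example.org/pred/describes"  # placeholder predicate


def annotate(query):
    """Greedy longest-match lookup of lexicon phrases, left to right."""
    words = query.lower().split()
    found, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest span first
            phrase = " ".join(words[i:j])
            if phrase in LEXICON:
                found.append(LEXICON[phrase])
                i = j  # continue after the matched phrase
                break
        else:
            i += 1  # no lexicon phrase starts here; skip this word
    return found


def compose_sparql(class_iris):
    """Build a SELECT query with one triple pattern per annotated class."""
    patterns = "\n".join(
        f"  ?entity <{PREDICATE}> <{iri}> ." for iri in class_iris
    )
    return "SELECT DISTINCT ?entity WHERE {\n" + patterns + "\n}"


classes = annotate("concentration of sodium in the portion of cytosol")
sparql = compose_sparql(classes)
print(sparql)
```

Words with no lexicon entry ("of", "in", "the") are skipped, so the generated query constrains `?entity` only by the phrases that were recognised. A real system would need to handle ambiguous phrases and produce several candidate queries, which is where the ranking step described in the abstract would come in.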