Building a search tool for compositely annotated entities using Transformer-based approach: Case study in Biosimulation Model Search Engine (BMSE)
The Transformer-based approaches to solving natural language processing (NLP) tasks such as BERT and GPT are gaining popularity due to their ability to achieve high performance. These approaches benefit from using enormous data sizes to create pre-trained models and the ability to understand the con...
Main authors: | Munarko, Yuda; Rampadarath, Anand; Nickerson, David |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2023 |
Subjects: | Software Tool Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10570691/ https://www.ncbi.nlm.nih.gov/pubmed/37842339 http://dx.doi.org/10.12688/f1000research.128982.1 |
_version_ | 1785119825538842624 |
---|---|
author | Munarko, Yuda Rampadarath, Anand Nickerson, David |
author_sort | Munarko, Yuda |
collection | PubMed |
description | The Transformer-based approaches to solving natural language processing (NLP) tasks such as BERT and GPT are gaining popularity due to their ability to achieve high performance. These approaches benefit from using enormous data sizes to create pre-trained models and the ability to understand the context of words in a sentence. Their use in the information retrieval domain is thought to increase effectiveness and efficiency. This paper demonstrates a BERT-based method (CASBERT) implementation to build a search tool over data annotated compositely using ontologies. The data was a collection of biosimulation models written using the CellML standard in the Physiome Model Repository (PMR). A biosimulation model structurally consists of basic entities of constants and variables that construct higher-level entities such as components, reactions, and the model. Finding these entities specific to their level is beneficial for various purposes regarding variable reuse, experiment setup, and model audit. Initially, we created embeddings representing compositely-annotated entities for constant and variable search (lowest level entity). Then, these low-level entity embeddings were vertically and efficiently combined to create higher-level entity embeddings to search components, models, images, and simulation setups. Our approach was general, so it can be used to create search tools with other data semantically annotated with ontologies - biosimulation models encoded in the SBML format, for example. Our tool is named Biosimulation Model Search Engine (BMSE). |
format | Online Article Text |
id | pubmed-10570691 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-105706912023-10-14 Building a search tool for compositely annotated entities using Transformer-based approach: Case study in Biosimulation Model Search Engine (BMSE) Munarko, Yuda Rampadarath, Anand Nickerson, David F1000Res Software Tool Article The Transformer-based approaches to solving natural language processing (NLP) tasks such as BERT and GPT are gaining popularity due to their ability to achieve high performance. These approaches benefit from using enormous data sizes to create pre-trained models and the ability to understand the context of words in a sentence. Their use in the information retrieval domain is thought to increase effectiveness and efficiency. This paper demonstrates a BERT-based method (CASBERT) implementation to build a search tool over data annotated compositely using ontologies. The data was a collection of biosimulation models written using the CellML standard in the Physiome Model Repository (PMR). A biosimulation model structurally consists of basic entities of constants and variables that construct higher-level entities such as components, reactions, and the model. Finding these entities specific to their level is beneficial for various purposes regarding variable reuse, experiment setup, and model audit. Initially, we created embeddings representing compositely-annotated entities for constant and variable search (lowest level entity). Then, these low-level entity embeddings were vertically and efficiently combined to create higher-level entity embeddings to search components, models, images, and simulation setups. Our approach was general, so it can be used to create search tools with other data semantically annotated with ontologies - biosimulation models encoded in the SBML format, for example. Our tool is named Biosimulation Model Search Engine (BMSE). 
F1000 Research Limited 2023-02-10 /pmc/articles/PMC10570691/ /pubmed/37842339 http://dx.doi.org/10.12688/f1000research.128982.1 Text en Copyright: © 2023 Munarko Y et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Munarko, Yuda Rampadarath, Anand Nickerson, David Building a search tool for compositely annotated entities using Transformer-based approach: Case study in Biosimulation Model Search Engine (BMSE) |
title | Building a search tool for compositely annotated entities using Transformer-based approach: Case study in Biosimulation Model Search Engine (BMSE) |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10570691/ https://www.ncbi.nlm.nih.gov/pubmed/37842339 http://dx.doi.org/10.12688/f1000research.128982.1 |
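
The description in this record says that low-level entity embeddings (constants and variables) are combined into higher-level entity embeddings (components, models) that can then be searched. The record does not say how they are combined, so the sketch below is only an illustration under stated assumptions: it uses plain element-wise averaging and cosine similarity, hand-written 3-dimensional toy vectors in place of BERT output, and hypothetical names (`mean_vector`, `cosine`, `search`, and the entity names). It is not the CASBERT or BMSE implementation.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy embeddings for lowest-level entities (variables).
# ASSUMPTION: in a real system these would come from a text encoder
# applied to each entity's ontology annotations.
variable_embeddings = {
    "membrane_voltage": [0.9, 0.1, 0.0],
    "sodium_current": [0.7, 0.3, 0.1],
    "glucose_concentration": [0.0, 0.2, 0.9],
}

# Hypothetical mapping from each component to the variables it contains.
components = {
    "membrane": ["membrane_voltage", "sodium_current"],
    "metabolism": ["glucose_concentration"],
}

# ASSUMPTION: a higher-level embedding is the mean of its members' embeddings.
component_embeddings = {
    name: mean_vector([variable_embeddings[v] for v in members])
    for name, members in components.items()
}


def search(query_embedding, index, top_k=1):
    """Return the names of the top_k entities most similar to the query."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    # A query vector close to the "electrical" direction of the toy space.
    print(search([1.0, 0.0, 0.0], component_embeddings))
```

Because the same `search` function works on any name-to-vector mapping, it can be pointed at `variable_embeddings` for variable-level search or at `component_embeddings` for component-level search, which mirrors the level-specific search described in the abstract.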