
Quality of word and concept embeddings in targetted biomedical domains

Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new ev...


Bibliographic Details
Main authors: Giancani, Salvatore, Albertoni, Riccardo, Catalano, Chiara Eva
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272317/
https://www.ncbi.nlm.nih.gov/pubmed/37332929
http://dx.doi.org/10.1016/j.heliyon.2023.e16818
author Giancani, Salvatore
Albertoni, Riccardo
Catalano, Chiara Eva
collection PubMed
description Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targetted domain of interest. It defines measures to assess the terminology, similarity, and analogy coverage, which are core aspects of the embeddings. Then, it discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain.
format Online
Article
Text
id pubmed-10272317
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10272317 2023-06-17 Quality of word and concept embeddings in targetted biomedical domains Giancani, Salvatore Albertoni, Riccardo Catalano, Chiara Eva Heliyon Research Article Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targetted domain of interest. It defines measures to assess the terminology, similarity, and analogy coverage, which are core aspects of the embeddings. Then, it discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain. Elsevier 2023-06-02 /pmc/articles/PMC10272317/ /pubmed/37332929 http://dx.doi.org/10.1016/j.heliyon.2023.e16818 Text en © 2023 Published by Elsevier Ltd. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Quality of word and concept embeddings in targetted biomedical domains
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272317/
https://www.ncbi.nlm.nih.gov/pubmed/37332929
http://dx.doi.org/10.1016/j.heliyon.2023.e16818