Quality of word and concept embeddings in targetted biomedical domains
Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new ev...
Main authors: | Giancani, Salvatore; Albertoni, Riccardo; Catalano, Chiara Eva |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272317/ https://www.ncbi.nlm.nih.gov/pubmed/37332929 http://dx.doi.org/10.1016/j.heliyon.2023.e16818 |
_version_ | 1785059466214899712 |
---|---|
author | Giancani, Salvatore Albertoni, Riccardo Catalano, Chiara Eva |
author_facet | Giancani, Salvatore Albertoni, Riccardo Catalano, Chiara Eva |
author_sort | Giancani, Salvatore |
collection | PubMed |
description | Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targetted domain of interest. It defines measures to assess the terminology, similarity, and analogy coverage, which are core aspects of the embeddings. Then, it discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain. |
format | Online Article Text |
id | pubmed-10272317 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10272317 2023-06-17 Quality of word and concept embeddings in targetted biomedical domains Giancani, Salvatore Albertoni, Riccardo Catalano, Chiara Eva Heliyon Research Article Embeddings are fundamental resources often reused for building intelligent systems in the biomedical context. As a result, evaluating the quality of previously trained embeddings and ensuring they cover the desired information is critical for the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targetted domain of interest. It defines measures to assess the terminology, similarity, and analogy coverage, which are core aspects of the embeddings. Then, it discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain. Elsevier 2023-06-02 /pmc/articles/PMC10272317/ /pubmed/37332929 http://dx.doi.org/10.1016/j.heliyon.2023.e16818 Text en © 2023 Published by Elsevier Ltd. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Giancani, Salvatore Albertoni, Riccardo Catalano, Chiara Eva Quality of word and concept embeddings in targetted biomedical domains |
title | Quality of word and concept embeddings in targetted biomedical domains |
title_full | Quality of word and concept embeddings in targetted biomedical domains |
title_fullStr | Quality of word and concept embeddings in targetted biomedical domains |
title_full_unstemmed | Quality of word and concept embeddings in targetted biomedical domains |
title_short | Quality of word and concept embeddings in targetted biomedical domains |
title_sort | quality of word and concept embeddings in targetted biomedical domains |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272317/ https://www.ncbi.nlm.nih.gov/pubmed/37332929 http://dx.doi.org/10.1016/j.heliyon.2023.e16818 |
work_keys_str_mv | AT giancanisalvatore qualityofwordandconceptembeddingsintargettedbiomedicaldomains AT albertoniriccardo qualityofwordandconceptembeddingsintargettedbiomedicaldomains AT catalanochiaraeva qualityofwordandconceptembeddingsintargettedbiomedicaldomains |