Fine-Tuning Word Embeddings for Hierarchical Representation of Data Using a Corpus and a Knowledge Base for Various Machine Learning Applications

Bibliographic Details
Main Authors: Alsuhaibani, Mohammed; Bollegala, Danushka
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8610673/
https://www.ncbi.nlm.nih.gov/pubmed/34824601
http://dx.doi.org/10.1155/2021/9761163
author Alsuhaibani, Mohammed
Bollegala, Danushka
collection PubMed
description Word embedding models have recently shown some capability to encode hierarchical information that exists in textual data. However, such models do not explicitly encode the hierarchical structure that exists among words. In this work, we propose a method to learn hierarchical word embeddings (HWEs) in a specific order to encode the hierarchical information of a knowledge base (KB) in a vector space. To learn the word embeddings, our proposed method considers not only the hypernym relations that exist between words in a KB but also contextual information in a text corpus. The experimental results on various applications, such as supervised and unsupervised hypernymy detection, graded lexical entailment prediction, hierarchical path prediction, and word reconstruction tasks, show the ability of the proposed method to encode the hierarchy. Moreover, the proposed method outperforms previously proposed methods for learning nonspecialised, hypernym-specific, and hierarchical word embeddings on multiple benchmarks.
format Online
Article
Text
id pubmed-8610673
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8610673 2021-11-24 Comput Math Methods Med, Research Article. Hindawi 2021-11-16 /pmc/articles/PMC8610673/ /pubmed/34824601 http://dx.doi.org/10.1155/2021/9761163 Text en Copyright © 2021 Mohammed Alsuhaibani and Danushka Bollegala. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fine-Tuning Word Embeddings for Hierarchical Representation of Data Using a Corpus and a Knowledge Base for Various Machine Learning Applications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8610673/
https://www.ncbi.nlm.nih.gov/pubmed/34824601
http://dx.doi.org/10.1155/2021/9761163
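The description above outlines a method that learns word embeddings jointly from contextual information in a text corpus and from hypernym relations in a knowledge base. As a rough, generic illustration of that idea (not the authors' HWE objective), the Python sketch below combines a skip-gram-with-negative-sampling corpus term with a simple penalty that pulls each hyponym vector towards its hypernym vector; the toy vocabulary, the training pairs, and the weighting factor lam are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and training pairs (illustrative assumptions, not from the paper).
vocab = ["animal", "mammal", "dog", "cat", "bark", "purr"]
idx = {w: i for i, w in enumerate(vocab)}
corpus_pairs = [("dog", "bark"), ("cat", "purr"), ("dog", "mammal"), ("cat", "mammal")]  # (target, context) pairs from a corpus window
hypernym_pairs = [("dog", "mammal"), ("cat", "mammal"), ("mammal", "animal")]            # (hyponym, hypernym) pairs from a KB such as WordNet

dim, lr, lam, epochs, neg = 16, 0.05, 0.5, 200, 2
W = rng.normal(scale=0.1, size=(len(vocab), dim))  # target (word) embeddings
C = rng.normal(scale=0.1, size=(len(vocab), dim))  # context embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(epochs):
    # Corpus term: skip-gram with negative sampling.
    for t, c in corpus_pairs:
        ti, ci = idx[t], idx[c]
        wt, cc = W[ti].copy(), C[ci].copy()
        g = sigmoid(wt @ cc) - 1.0              # gradient of -log sigma(w . c)
        W[ti] -= lr * g * cc
        C[ci] -= lr * g * wt
        # Random negatives (may occasionally hit a true pair; fine for a toy sketch).
        for ni in rng.integers(0, len(vocab), size=neg):
            g = sigmoid(W[ti] @ C[ni])          # gradient of -log sigma(-w . c_neg)
            wneg = C[ni].copy()
            C[ni] -= lr * g * W[ti]
            W[ti] -= lr * g * wneg
    # KB term: a squared-distance penalty pulling each hyponym towards its hypernym.
    for hypo, hyper in hypernym_pairs:
        hi, pi = idx[hypo], idx[hyper]
        diff = W[hi] - W[pi]
        W[hi] -= lr * lam * diff
        W[pi] += lr * lam * diff

# Hyponyms should end up close to their hypernyms.
dog, mammal = W[idx["dog"]], W[idx["mammal"]]
print("cos(dog, mammal) =", dog @ mammal / (np.linalg.norm(dog) * np.linalg.norm(mammal)))

The symmetric distance penalty used here only illustrates joint corpus-and-KB training; it does not by itself capture the direction of the hypernym relation, which the abstract indicates the proposed HWE method is designed to encode.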