
DAKE: Document-Level Attention for Keyphrase Extraction


Bibliographic Details
Main Authors: Santosh, Tokala Yaswanth Sri Sai, Sanyal, Debarshi Kumar, Bhowmick, Plaban Kumar, Das, Partha Pratim
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148091/
http://dx.doi.org/10.1007/978-3-030-45442-5_49
_version_ 1783520529778475008
author Santosh, Tokala Yaswanth Sri Sai
Sanyal, Debarshi Kumar
Bhowmick, Plaban Kumar
Das, Partha Pratim
author_facet Santosh, Tokala Yaswanth Sri Sai
Sanyal, Debarshi Kumar
Bhowmick, Plaban Kumar
Das, Partha Pratim
author_sort Santosh, Tokala Yaswanth Sri Sai
collection PubMed
description Keyphrases provide a concise representation of the topical content of a document and they are helpful in various downstream tasks. Previous approaches for keyphrase extraction model it as a sequence labelling task and use local contextual information to understand the semantics of the input text but they fail when the local context is ambiguous or unclear. We present a new framework to improve keyphrase extraction by utilizing additional supporting contextual information. We retrieve this additional information from other sentences within the same document. To this end, we propose Document-level Attention for Keyphrase Extraction (DAKE), which comprises Bidirectional Long Short-Term Memory networks that capture hidden semantics in text, a document-level attention mechanism to incorporate document level contextual information, gating mechanisms which help to determine the influence of additional contextual information on the fusion with local contextual information, and Conditional Random Fields which capture output label dependencies. Our experimental results on a dataset of research papers show that the proposed model outperforms previous state-of-the-art approaches for keyphrase extraction.
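The description above outlines DAKE's pipeline: a BiLSTM encodes each sentence, a document-level attention mechanism gathers supporting context from other sentences in the same document, a gate controls how much of that context is fused with the local token representation, and a CRF decodes the label sequence. The attention-and-gating step can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the weight matrices (`W_a`, `W_g`, `U_g`), the bilinear scoring form, and the convex-combination fusion are assumptions for exposition, and the paper's exact parameterization may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def document_attention(h_t, doc_states, W_a):
    # Score the current token's BiLSTM state against the representation
    # of every other sentence in the document (bilinear scoring, assumed).
    scores = doc_states @ (W_a @ h_t)        # shape: (num_sentences,)
    alpha = softmax(scores)                  # attention weights over sentences
    context = alpha @ doc_states             # weighted document-level context
    return context, alpha

def gated_fusion(h_t, context, W_g, U_g):
    # A sigmoid gate decides how much document context to admit
    # relative to the local representation (fusion form assumed).
    g = 1.0 / (1.0 + np.exp(-(W_g @ h_t + U_g @ context)))
    return g * h_t + (1.0 - g) * context

rng = np.random.default_rng(0)
d = 8                                        # toy hidden size
h_t = rng.standard_normal(d)                 # BiLSTM state of current token
doc = rng.standard_normal((5, d))            # states of 5 supporting sentences
W_a = rng.standard_normal((d, d))
W_g = rng.standard_normal((d, d))
U_g = rng.standard_normal((d, d))

ctx, alpha = document_attention(h_t, doc, W_a)
fused = gated_fusion(h_t, ctx, W_g, U_g)
```

The fused representation would then be fed to a CRF layer (not shown), which scores entire label sequences so that keyphrase boundary labels remain consistent across adjacent tokens.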
format Online
Article
Text
id pubmed-7148091
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7148091 2020-04-13 DAKE: Document-Level Attention for Keyphrase Extraction Santosh, Tokala Yaswanth Sri Sai Sanyal, Debarshi Kumar Bhowmick, Plaban Kumar Das, Partha Pratim Advances in Information Retrieval Article [abstract identical to the description field above] 2020-03-24 /pmc/articles/PMC7148091/ http://dx.doi.org/10.1007/978-3-030-45442-5_49 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Santosh, Tokala Yaswanth Sri Sai
Sanyal, Debarshi Kumar
Bhowmick, Plaban Kumar
Das, Partha Pratim
DAKE: Document-Level Attention for Keyphrase Extraction
title DAKE: Document-Level Attention for Keyphrase Extraction
title_full DAKE: Document-Level Attention for Keyphrase Extraction
title_fullStr DAKE: Document-Level Attention for Keyphrase Extraction
title_full_unstemmed DAKE: Document-Level Attention for Keyphrase Extraction
title_short DAKE: Document-Level Attention for Keyphrase Extraction
title_sort dake: document-level attention for keyphrase extraction
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148091/
http://dx.doi.org/10.1007/978-3-030-45442-5_49
work_keys_str_mv AT santoshtokalayaswanthsrisai dakedocumentlevelattentionforkeyphraseextraction
AT sanyaldebarshikumar dakedocumentlevelattentionforkeyphraseextraction
AT bhowmickplabankumar dakedocumentlevelattentionforkeyphraseextraction
AT dasparthapratim dakedocumentlevelattentionforkeyphraseextraction