Keyword Extraction: A Modern Perspective

The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP),...

Bibliographic Details
Main Author: Nomoto, Tadashi
Format: Online Article Text
Language: English
Published: Springer Nature Singapore 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9753895/
https://www.ncbi.nlm.nih.gov/pubmed/36536753
http://dx.doi.org/10.1007/s42979-022-01481-7
_version_ 1784851068200419328
author Nomoto, Tadashi
author_facet Nomoto, Tadashi
author_sort Nomoto, Tadashi
collection PubMed
description The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP), and the emerging Neural paradigm. The 1990s have seen some early attempts to tackle the issue primarily based on text statistics [13, 17]. Meanwhile, in IR, efforts were largely led by DARPA’s Topic Detection and Tracking (TDT) project [2]. In this contribution, we discuss how past innovations paved a way for more recent developments, such as LDA, PageRank, and Neural Networks. We walk through the history of keyword extraction over the last 50 years, noting differences and similarities among methods that emerged during the time. We conduct a large meta-analysis of the past literature using datasets from news media, science, and medicine to business and bureaucracy, to draw a general picture of what a successful approach would look like.
format Online
Article
Text
id pubmed-9753895
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Nature Singapore
record_format MEDLINE/PubMed
spelling pubmed-9753895 2022-12-15 Keyword Extraction: A Modern Perspective Nomoto, Tadashi SN Comput Sci Survey Article The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP), and the emerging Neural paradigm. The 1990s have seen some early attempts to tackle the issue primarily based on text statistics [13, 17]. Meanwhile, in IR, efforts were largely led by DARPA’s Topic Detection and Tracking (TDT) project [2]. In this contribution, we discuss how past innovations paved a way for more recent developments, such as LDA, PageRank, and Neural Networks. We walk through the history of keyword extraction over the last 50 years, noting differences and similarities among methods that emerged during the time. We conduct a large meta-analysis of the past literature using datasets from news media, science, and medicine to business and bureaucracy, to draw a general picture of what a successful approach would look like. Springer Nature Singapore 2022-12-15 2023 /pmc/articles/PMC9753895/ /pubmed/36536753 http://dx.doi.org/10.1007/s42979-022-01481-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Survey Article
Nomoto, Tadashi
Keyword Extraction: A Modern Perspective
title Keyword Extraction: A Modern Perspective
title_full Keyword Extraction: A Modern Perspective
title_fullStr Keyword Extraction: A Modern Perspective
title_full_unstemmed Keyword Extraction: A Modern Perspective
title_short Keyword Extraction: A Modern Perspective
title_sort keyword extraction: a modern perspective
topic Survey Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9753895/
https://www.ncbi.nlm.nih.gov/pubmed/36536753
http://dx.doi.org/10.1007/s42979-022-01481-7
work_keys_str_mv AT nomototadashi keywordextractionamodernperspective