Keyword Extraction: A Modern Perspective
The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP),...
Main Author: | Nomoto, Tadashi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Nature Singapore 2022 |
Subjects: | Survey Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9753895/ https://www.ncbi.nlm.nih.gov/pubmed/36536753 http://dx.doi.org/10.1007/s42979-022-01481-7 |
_version_ | 1784851068200419328 |
---|---|
author | Nomoto, Tadashi |
author_facet | Nomoto, Tadashi |
author_sort | Nomoto, Tadashi |
collection | PubMed |
description | The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP), and the emerging Neural paradigm. The 1990s have seen some early attempts to tackle the issue primarily based on text statistics [13, 17]. Meanwhile, in IR, efforts were largely led by DARPA’s Topic Detection and Tracking (TDT) project [2]. In this contribution, we discuss how past innovations paved a way for more recent developments, such as LDA, PageRank, and Neural Networks. We walk through the history of keyword extraction over the last 50 years, noting differences and similarities among methods that emerged during the time. We conduct a large meta-analysis of the past literature using datasets from news media, science, and medicine to business and bureaucracy, to draw a general picture of what a successful approach would look like. |
format | Online Article Text |
id | pubmed-9753895 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer Nature Singapore |
record_format | MEDLINE/PubMed |
spelling | pubmed-9753895 2022-12-15 Keyword Extraction: A Modern Perspective Nomoto, Tadashi SN Comput Sci Survey Article The goal of keyword extraction is to extract from a text, words, or phrases indicative of what it is talking about. In this work, we look at keyword extraction from a number of different perspectives: Statistics, Automatic Term Indexing, Information Retrieval (IR), Natural Language Processing (NLP), and the emerging Neural paradigm. The 1990s have seen some early attempts to tackle the issue primarily based on text statistics [13, 17]. Meanwhile, in IR, efforts were largely led by DARPA’s Topic Detection and Tracking (TDT) project [2]. In this contribution, we discuss how past innovations paved a way for more recent developments, such as LDA, PageRank, and Neural Networks. We walk through the history of keyword extraction over the last 50 years, noting differences and similarities among methods that emerged during the time. We conduct a large meta-analysis of the past literature using datasets from news media, science, and medicine to business and bureaucracy, to draw a general picture of what a successful approach would look like. Springer Nature Singapore 2022-12-15 2023 /pmc/articles/PMC9753895/ /pubmed/36536753 http://dx.doi.org/10.1007/s42979-022-01481-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Survey Article Nomoto, Tadashi Keyword Extraction: A Modern Perspective |
title | Keyword Extraction: A Modern Perspective |
title_full | Keyword Extraction: A Modern Perspective |
title_fullStr | Keyword Extraction: A Modern Perspective |
title_full_unstemmed | Keyword Extraction: A Modern Perspective |
title_short | Keyword Extraction: A Modern Perspective |
title_sort | keyword extraction: a modern perspective |
topic | Survey Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9753895/ https://www.ncbi.nlm.nih.gov/pubmed/36536753 http://dx.doi.org/10.1007/s42979-022-01481-7 |
work_keys_str_mv | AT nomototadashi keywordextractionamodernperspective |