
A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges

The transition from offline to online activities, driven by the social disruption of the COVID-19 pandemic lockdown, has led to an increase in online economic and social activities. In this regard, Automatic Keyword Extraction (AKE) from textual data has become even more interesting...


Bibliographic Details
Main Author: Garg, Muskan
Format: Online Article Text
Language: English
Published: Springer Netherlands 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062621/
https://www.ncbi.nlm.nih.gov/pubmed/33907346
http://dx.doi.org/10.1007/s10462-021-10010-6
_version_ 1783681802477502464
author Garg, Muskan
author_facet Garg, Muskan
author_sort Garg, Muskan
collection PubMed
description The transition from offline to online activities, driven by the social disruption of the COVID-19 pandemic lockdown, has led to an increase in online economic and social activities. In this regard, Automatic Keyword Extraction (AKE) from textual data has become even more interesting because of its applications across different domains of Natural Language Processing (NLP). Graphical Keyword Extraction Techniques (GKET) in the literature use a Graph of Words (GoW) for analysis along different dimensions. This article studies these dimensions for GKET, namely the GoW representation, the statistical properties of GoW, the stability of the structure of GoW, the diversity of approaches over GoW for GKET, and the ranking of nodes in GoW. To elucidate these dimensions, a comprehensive survey of GKET is carried out across different domains to draw inferences from the existing literature. These inferences are used to lay down possible research directions for interdisciplinary studies of network science and NLP. In addition, experimental results are analysed to compare and contrast the existing GKET over 21 different datasets, to analyse Word Co-occurrence Networks (WCN) for 15 different languages, and to study the structure of WCN for different genres. Some strong correspondences between approaches from different disciplines are identified for different dimensions, namely GoW representation: ‘Line Graphs’ and ‘Bigram Words Graphs’; and feature extraction and selection using eigenvalues: ‘Random Walk’ and ‘Spectral Clustering’. Observations on the need to integrate multiple dimensions have opened new research directions in the interdisciplinary field of network science and NLP, applicable to handling streaming data and language-independent NLP.
format Online
Article
Text
id pubmed-8062621
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-8062621 2021-04-23 A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges Garg, Muskan Artif Intell Rev Article The transition from offline to online activities, driven by the social disruption of the COVID-19 pandemic lockdown, has led to an increase in online economic and social activities. In this regard, Automatic Keyword Extraction (AKE) from textual data has become even more interesting because of its applications across different domains of Natural Language Processing (NLP). Graphical Keyword Extraction Techniques (GKET) in the literature use a Graph of Words (GoW) for analysis along different dimensions. This article studies these dimensions for GKET, namely the GoW representation, the statistical properties of GoW, the stability of the structure of GoW, the diversity of approaches over GoW for GKET, and the ranking of nodes in GoW. To elucidate these dimensions, a comprehensive survey of GKET is carried out across different domains to draw inferences from the existing literature. These inferences are used to lay down possible research directions for interdisciplinary studies of network science and NLP. In addition, experimental results are analysed to compare and contrast the existing GKET over 21 different datasets, to analyse Word Co-occurrence Networks (WCN) for 15 different languages, and to study the structure of WCN for different genres. Some strong correspondences between approaches from different disciplines are identified for different dimensions, namely GoW representation: ‘Line Graphs’ and ‘Bigram Words Graphs’; and feature extraction and selection using eigenvalues: ‘Random Walk’ and ‘Spectral Clustering’. Observations on the need to integrate multiple dimensions have opened new research directions in the interdisciplinary field of network science and NLP, applicable to handling streaming data and language-independent NLP. Springer Netherlands 2021-04-23 2021 /pmc/articles/PMC8062621/ /pubmed/33907346 http://dx.doi.org/10.1007/s10462-021-10010-6 Text en © The Author(s), under exclusive licence to Springer Nature B.V. 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Garg, Muskan
A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title_full A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title_fullStr A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title_full_unstemmed A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title_short A survey on different dimensions for graphical keyword extraction techniques: Issues and Challenges
title_sort survey on different dimensions for graphical keyword extraction techniques: issues and challenges
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062621/
https://www.ncbi.nlm.nih.gov/pubmed/33907346
http://dx.doi.org/10.1007/s10462-021-10010-6
work_keys_str_mv AT gargmuskan asurveyondifferentdimensionsforgraphicalkeywordextractiontechniquesissuesandchallenges
AT gargmuskan surveyondifferentdimensionsforgraphicalkeywordextractiontechniquesissuesandchallenges