
An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents

This paper introduces a novel graph-based approach to select features from multiple textual documents. The proposed solution enables the investigation of the importance of a term within a whole corpus of documents by utilizing contemporary graph theory methods, such as community detection algorithms a...


Bibliographic Details
Main authors: Giarelis, Nikolaos; Kanakaris, Nikos; Karacapilidis, Nikos
Format: Online Article Text
Language: English
Published: 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7256382/
http://dx.doi.org/10.1007/978-3-030-49161-1_9
collection PubMed
description This paper introduces a novel graph-based approach to select features from multiple textual documents. The proposed solution enables the investigation of the importance of a term within a whole corpus of documents by utilizing contemporary graph theory methods, such as community detection algorithms and node centrality measures. Compared to well-tried existing solutions, evaluation results show that the proposed approach increases the accuracy of most text classifiers employed and decreases the number of features required to achieve ‘state-of-the-art’ accuracy. Well-known datasets used for the experiments reported in this paper include 20Newsgroups, LingSpam, Amazon Reviews and Reuters.
format Online
Article
Text
id pubmed-7256382
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7256382 2020-05-29. An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents. Giarelis, Nikolaos; Kanakaris, Nikos; Karacapilidis, Nikos. Artificial Intelligence Applications and Innovations. Article. 2020-05-06. /pmc/articles/PMC7256382/ http://dx.doi.org/10.1007/978-3-030-49161-1_9 Text en. © IFIP International Federation for Information Processing 2020. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.