Multilayer network based comparative document analysis (MUNCoDA)

Bibliographic Details
Main Authors: Sebestyén, Viktor; Domokos, Endre; Abonyi, János
Format: Online Article Text
Language: English
Published: Elsevier 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7226890/
https://www.ncbi.nlm.nih.gov/pubmed/32426247
http://dx.doi.org/10.1016/j.mex.2020.100902
Description
Summary: The proposed multilayer network-based comparative document analysis (MUNCoDA) method supports the identification of the common points of a set of documents that deal with the same subject area. As documents are transformed into networks of informative word pairs, the collection of documents forms a multilayer network that allows the comparative evaluation of the texts. The multilayer network can be visualized and analyzed to highlight how the texts are structured. The topics of the documents can be clustered based on the developed similarity measures. By exploring the network centralities, topic importance values can be assigned. The method is fully automated by KNIME preprocessing tools and MATLAB/Octave code.
• Networks can be formed based on informative word pairs of multiple documents;
• The analysis of the proposed multilayer networks provides information for multi-document summarization;
• Words and documents can be clustered based on node similarity and edge overlap measures.
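
The idea in the summary can be illustrated with a small Python sketch. It is not the authors' KNIME and MATLAB/Octave implementation, and the record does not say how "informative" word pairs or the similarity measures are defined. The sketch therefore makes its own assumptions: a word pair is any two non-stopword words that occur in the same sentence, edge overlap between two documents is the Jaccard index of their edge sets, and word importance is plain degree summed over layers. The stopword list, the function names, and the two sample texts are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import re

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"the", "of", "a", "and", "to", "in", "is", "are", "for", "on"}


def word_pair_layer(text, min_count=1):
    """Build one layer: the set of unordered word pairs sharing a sentence."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted({w for w in re.findall(r"[a-z]+", sentence)
                        if w not in STOPWORDS})
        # Sorting first makes each pair a canonical (alphabetical) tuple.
        counts.update(combinations(words, 2))
    return {pair for pair, n in counts.items() if n >= min_count}


def edge_overlap(layer_a, layer_b):
    """Jaccard index of two edge sets (assumed similarity measure)."""
    union = layer_a | layer_b
    return len(layer_a & layer_b) / len(union) if union else 0.0


def degree_across_layers(layers):
    """Sum each word's number of edges over all layers."""
    degree = Counter()
    for edges in layers.values():
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
    return degree


# Hypothetical sample documents, one network layer per document.
documents = {
    "doc_a": "Water quality monitoring protects rivers. Monitoring reduces pollution.",
    "doc_b": "Water quality monitoring supports policy. Pollution harms rivers.",
}
layers = {name: word_pair_layer(text) for name, text in documents.items()}
overlap = edge_overlap(layers["doc_a"], layers["doc_b"])
top_words = degree_across_layers(layers).most_common(3)
```

Here `layers` plays the role of the multilayer network, `overlap` is a document-to-document similarity that could feed a clustering step, and `top_words` ranks words by a simple centrality. Raising `min_count` is one crude way to keep only word pairs that recur within a document.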