A Complex Network Approach to Stylometry

Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems have proved useful for creating several language models. Despite the large number of studies devoted to representing texts with physical models, only a limi...

Bibliographic Details
Main author: Amancio, Diego Raphael
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4552030/
https://www.ncbi.nlm.nih.gov/pubmed/26313921
http://dx.doi.org/10.1371/journal.pone.0136076
author Amancio, Diego Raphael
collection PubMed
description Statistical methods have been widely employed to study the fundamental properties of language. In recent years, methods from complex and dynamical systems have proved useful for creating several language models. Despite the large number of studies devoted to representing texts with physical models, only a limited number have shown how the properties of the underlying physical systems can be employed to improve the performance of natural language processing tasks. In this paper, I address this problem by devising complex network methods that are able to improve the performance of current statistical methods. Using a fuzzy classification strategy, I show that the topological properties extracted from texts complement the traditional textual description. In several cases, the hybrid approaches outperformed those that used only traditional or only networked methods. Because the proposed model is generic, the framework devised here could be straightforwardly used to study similar textual applications where the topology plays a pivotal role in the description of the interacting agents.
format Online Article Text
id pubmed-4552030
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4552030 2015-09-01 A Complex Network Approach to Stylometry Amancio, Diego Raphael PLoS One Research Article Public Library of Science 2015-08-27 /pmc/articles/PMC4552030/ /pubmed/26313921 http://dx.doi.org/10.1371/journal.pone.0136076 Text en © 2015 Diego Raphael Amancio http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Complex Network Approach to Stylometry
topic Research Article