
An Information Theoretic Clustering Approach for Unveiling Authorship Affinities in Shakespearean Era Plays and Poems

In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 wor...

Bibliographic Details
Main authors: Arefin, Ahmed Shamsul; Vimieiro, Renato; Riveros, Carlos; Craig, Hugh; Moscato, Pablo
Format: Online Article Text
Language: English
Published: Public Library of Science, 2014
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4210181/
https://www.ncbi.nlm.nih.gov/pubmed/25347727
http://dx.doi.org/10.1371/journal.pone.0111445
author Arefin, Ahmed Shamsul
Vimieiro, Renato
Riveros, Carlos
Craig, Hugh
Moscato, Pablo
collection PubMed
description In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed.
format Online
Article
Text
id pubmed-4210181
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4210181 2014-10-30 An Information Theoretic Clustering Approach for Unveiling Authorship Affinities in Shakespearean Era Plays and Poems Arefin, Ahmed Shamsul Vimieiro, Renato Riveros, Carlos Craig, Hugh Moscato, Pablo PLoS One Research Article In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed. Public Library of Science 2014-10-27 /pmc/articles/PMC4210181/ /pubmed/25347727 http://dx.doi.org/10.1371/journal.pone.0111445 Text en © 2014 Arefin et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title An Information Theoretic Clustering Approach for Unveiling Authorship Affinities in Shakespearean Era Plays and Poems
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4210181/
https://www.ncbi.nlm.nih.gov/pubmed/25347727
http://dx.doi.org/10.1371/journal.pone.0111445