Comparing the Hierarchy of Keywords in On-Line News Portals

Hierarchical organization is prevalent in networks representing a wide range of systems in nature and society. An important example is given by the tag hierarchies extracted from large on-line data repositories such as scientific publication archives, file sharing portals, blogs, on-line news portal...

Bibliographic Details
Main Authors: Tibély, Gergely; Sousa-Rodrigues, David; Pollner, Péter; Palla, Gergely
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5089747/
https://www.ncbi.nlm.nih.gov/pubmed/27802319
http://dx.doi.org/10.1371/journal.pone.0165728
author Tibély, Gergely
Sousa-Rodrigues, David
Pollner, Péter
Palla, Gergely
author_sort Tibély, Gergely
collection PubMed
description Hierarchical organization is prevalent in networks representing a wide range of systems in nature and society. An important example is given by the tag hierarchies extracted from large on-line data repositories such as scientific publication archives, file sharing portals, blogs, on-line news portals, etc. The tagging of the stored objects with informative keywords in such repositories has become very common, and in most cases the tags on a given item are free words chosen by the authors independently. Therefore, the relations among keywords appearing in an on-line data repository are unknown in general. However, in most cases the topics and concepts described by these keywords are forming a latent hierarchy, with the more general topics and categories at the top, and more specialized ones at the bottom. There are several algorithms available for deducing this hierarchy from the statistical features of the keywords. In the present work we apply a recent, co-occurrence-based tag hierarchy extraction method to sets of keywords obtained from four different on-line news portals. The resulting hierarchies show substantial differences not just in the topics rendered as important (being at the top of the hierarchy) or of less interest (categorized low in the hierarchy), but also in the underlying network structure. This reveals discrepancies between the plausible keyword association frameworks in the studied news portals.
format Online
Article
Text
id pubmed-5089747
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5089747 2016-11-15 Comparing the Hierarchy of Keywords in On-Line News Portals Tibély, Gergely Sousa-Rodrigues, David Pollner, Péter Palla, Gergely PLoS One Research Article
Public Library of Science 2016-11-01 /pmc/articles/PMC5089747/ /pubmed/27802319 http://dx.doi.org/10.1371/journal.pone.0165728 Text en © 2016 Tibély et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Comparing the Hierarchy of Keywords in On-Line News Portals
topic Research Article