
Clickstream Data Yields High-Resolution Maps of Science

BACKGROUND: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations com...


Bibliographic Details
Main Authors: Bollen, Johan, Van de Sompel, Herbert, Hagberg, Aric, Bettencourt, Luis, Chute, Ryan, Rodriguez, Marko A., Balakireva, Lyudmila
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2652715/
https://www.ncbi.nlm.nih.gov/pubmed/19277205
http://dx.doi.org/10.1371/journal.pone.0004803
author Bollen, Johan
Van de Sompel, Herbert
Hagberg, Aric
Bettencourt, Luis
Chute, Ryan
Rodriguez, Marko A.
Balakireva, Lyudmila
collection PubMed
description BACKGROUND: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science. METHODOLOGY: Over the course of 2007 and 2008, we collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of world-wide use of scholarly web portals in 2006, and provides a balanced coverage of the humanities, social sciences, and natural sciences. A journal clickstream model, i.e. a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institute's Architecture and Art Thesaurus. The resulting model was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences. CONCLUSIONS: Maps of science resulting from large-scale clickstream data provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data.
format Text
id pubmed-2652715
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-26527152009-03-11 Clickstream Data Yields High-Resolution Maps of Science Bollen, Johan Van de Sompel, Herbert Hagberg, Aric Bettencourt, Luis Chute, Ryan Rodriguez, Marko A. Balakireva, Lyudmila PLoS One Research Article BACKGROUND: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science. METHODOLOGY: Over the course of 2007 and 2008, we collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of world-wide use of scholarly web portals in 2006, and provides a balanced coverage of the humanities, social sciences, and natural sciences. A journal clickstream model, i.e. a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institute's Architecture and Art Thesaurus. The resulting model was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences. 
CONCLUSIONS: Maps of science resulting from large-scale clickstream data provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data. Public Library of Science 2009-03-11 /pmc/articles/PMC2652715/ /pubmed/19277205 http://dx.doi.org/10.1371/journal.pone.0004803 Text en This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. https://creativecommons.org/publicdomain/zero/1.0/
title Clickstream Data Yields High-Resolution Maps of Science
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2652715/
https://www.ncbi.nlm.nih.gov/pubmed/19277205
http://dx.doi.org/10.1371/journal.pone.0004803