Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods

Bibliographic Details
Main Authors: Šubelj, Lovro, van Eck, Nees Jan, Waltman, Ludo
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4849655/
https://www.ncbi.nlm.nih.gov/pubmed/27124610
http://dx.doi.org/10.1371/journal.pone.0154404
_version_ 1782429571431268352
author Šubelj, Lovro
van Eck, Nees Jan
Waltman, Ludo
author_facet Šubelj, Lovro
van Eck, Nees Jan
Waltman, Ludo
author_sort Šubelj, Lovro
collection PubMed
description Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.
format Online
Article
Text
id pubmed-4849655
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4849655 2016-05-07
Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods
Šubelj, Lovro
van Eck, Nees Jan
Waltman, Ludo
PLoS One
Research Article
Public Library of Science 2016-04-28
/pmc/articles/PMC4849655/
/pubmed/27124610
http://dx.doi.org/10.1371/journal.pone.0154404
Text en
© 2016 Šubelj et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Šubelj, Lovro
van Eck, Nees Jan
Waltman, Ludo
Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods
title Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods
title_sort clustering scientific publications based on citation relations: a systematic comparison of different methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4849655/
https://www.ncbi.nlm.nih.gov/pubmed/27124610
http://dx.doi.org/10.1371/journal.pone.0154404