
‘Seed + expand’: a general methodology for detecting publication oeuvres of individual researchers

The study of science at the individual scholar level requires the disambiguation of author names. The creation of author’s publication oeuvres involves matching the list of unique author names to names used in publication databases. Despite recent progress in the development of unique author identif...


Bibliographic Details
Main authors: Reijnhoudt, Linda; Costas, Rodrigo; Noyons, Ed; Börner, Katy; Scharnhorst, Andrea
Format: Online Article Text
Language: English
Published: Springer Netherlands 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4190454/
https://www.ncbi.nlm.nih.gov/pubmed/25328257
http://dx.doi.org/10.1007/s11192-014-1256-0
author Reijnhoudt, Linda
Costas, Rodrigo
Noyons, Ed
Börner, Katy
Scharnhorst, Andrea
collection PubMed
description The study of science at the individual scholar level requires the disambiguation of author names. The creation of author’s publication oeuvres involves matching the list of unique author names to names used in publication databases. Despite recent progress in the development of unique author identifiers, e.g., ORCID, VIVO, or DAI, author disambiguation remains a key problem when it comes to large-scale bibliometric analysis using data from multiple databases. This study introduces and tests a new methodology called seed + expand for semi-automatic bibliographic data collection for a given set of individual authors. Specifically, we identify the oeuvre of a set of Dutch full professors during the period 1980–2011. In particular, we combine author records from a Dutch National Research Information System (NARCIS) with publication records from the Web of Science. Starting with an initial list of 8,378 names, we identify ‘seed publications’ for each author using five different approaches. Subsequently, we ‘expand’ the set of publications in three different approaches. The different approaches are compared and resulting oeuvres are evaluated on precision and recall using a ‘gold standard’ dataset of authors for which verified publications in the period 2001–2010 are available.
format Online
Article
Text
id pubmed-4190454
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-4190454 2014-10-15 ‘Seed + expand’: a general methodology for detecting publication oeuvres of individual researchers. Reijnhoudt, Linda; Costas, Rodrigo; Noyons, Ed; Börner, Katy; Scharnhorst, Andrea. Scientometrics. Article. Springer Netherlands 2014-03-05 2014 /pmc/articles/PMC4190454/ /pubmed/25328257 http://dx.doi.org/10.1007/s11192-014-1256-0 Text en © The Author(s) 2014 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
title ‘Seed + expand’: a general methodology for detecting publication oeuvres of individual researchers
topic Article