
Evaluation of unique identifiers used as keys to match identical publications in Pure and SciVal – a case study from health science

Unique identifiers (UID) are seen as an effective key to match identical publications across databases or identify duplicates in a database. The objective of the present study is to investigate how well UIDs work as match keys in the integration between Pure and SciVal, based on a case with publicat...


Bibliographic Details
Main Authors: Madsen, Heidi Holst; Madsen, Dicte; Gauffriau, Marianne
Format: Online Article Text
Language: English
Published: F1000Research 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5017295/
https://www.ncbi.nlm.nih.gov/pubmed/27635223
http://dx.doi.org/10.12688/f1000research.8913.2
author Madsen, Heidi Holst
Madsen, Dicte
Gauffriau, Marianne
collection PubMed
description Unique identifiers (UID) are seen as an effective key to match identical publications across databases or identify duplicates in a database. The objective of the present study is to investigate how well UIDs work as match keys in the integration between Pure and SciVal, based on a case with publications from the health sciences. We evaluate the matching process based on information about coverage, precision, and characteristics of publications matched versus not matched with UIDs as the match keys. We analyze this information to detect errors, if any, in the matching process. As an example, we also briefly discuss how publication sets formed by using UIDs as the match keys may affect three bibliometric indicators: the number of publications, the number of citations, and the average number of citations per publication. The objective is addressed in a literature review and a case study. The literature review shows that only a few studies evaluate how well UIDs work as a match key. From the literature we identify four error types: duplicate digital object identifiers (DOI), incorrect DOIs in reference lists and databases, DOIs not registered by the database where a bibliometric analysis is performed, and erroneous optical or special character recognition. The case study explores the use of UIDs in the integration between the databases Pure and SciVal. Specifically, journal publications in English are matched between the two databases. We find all error types except erroneous optical or special character recognition in our publication sets. In particular, the duplicate DOIs constitute a problem for the calculation of bibliometric indicators, as both keeping the duplicates to improve the reliability of citation counts and deleting them to improve the reliability of publication counts will distort the calculation of the average number of citations per publication. The use of UIDs as a match key in citation linking is implemented in many settings, and the availability of UIDs may become critical for the inclusion of a publication or a database in a bibliometric analysis.
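The matching idea in the description can be illustrated with a small sketch. It is not the procedure used by Pure, SciVal, or the authors; it only assumes that each record is a dict with the hypothetical keys "doi" and "citations", and every DOI and citation count below is invented for illustration. It matches a local set against an external set on normalized DOIs, computes coverage, flags DOIs shared by more than one local record, and shows how keeping or dropping such duplicates shifts the average number of citations per publication.

```python
from collections import Counter


def normalize_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def match_by_doi(local_records, external_records):
    """Split local_records into matched/unmatched and list duplicate DOIs."""
    external = {normalize_doi(r["doi"]) for r in external_records if r.get("doi")}
    counts = Counter(normalize_doi(r["doi"]) for r in local_records if r.get("doi"))
    matched, unmatched = [], []
    for r in local_records:
        doi = normalize_doi(r["doi"]) if r.get("doi") else None
        (matched if doi in external else unmatched).append(r)
    duplicates = sorted(d for d, n in counts.items() if n > 1)
    return matched, unmatched, duplicates


def drop_duplicate_dois(records):
    """Keep only the first record seen for each normalized DOI."""
    seen, kept = set(), []
    for r in records:
        doi = normalize_doi(r["doi"])
        if doi not in seen:
            seen.add(doi)
            kept.append(r)
    return kept


def indicators(records):
    """Number of publications, citations, and citations per publication."""
    n = len(records)
    c = sum(r["citations"] for r in records)
    return {"publications": n, "citations": c,
            "citations_per_publication": c / n if n else 0.0}


# Invented sample data: two local records share one DOI, one record has none.
local = [
    {"title": "A", "doi": "10.9999/a", "citations": 10},
    {"title": "B", "doi": "10.9999/B", "citations": 4},
    {"title": "C", "doi": "10.9999/b", "citations": 2},
    {"title": "D", "doi": None, "citations": 1},
]
external = [{"doi": "10.9999/a"}, {"doi": "http://dx.doi.org/10.9999/b"}]

matched, unmatched, duplicates = match_by_doi(local, external)
coverage = len(matched) / len(local)
with_duplicates = indicators(matched)
without_duplicates = indicators(drop_duplicate_dois(matched))
```

In this toy set, keeping both records that share a DOI gives 3 publications and 16 citations, while dropping the second gives 2 publications and 14 citations, so the average moves from about 5.3 to 7.0. Neither figure is obviously the right one, which is the distortion the description refers to.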
format Online
Article
Text
id pubmed-5017295
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-5017295 2016-09-14 F1000Res Research Article F1000Research 2016-09-06 /pmc/articles/PMC5017295/ /pubmed/27635223 http://dx.doi.org/10.12688/f1000research.8913.2 Text en Copyright: © 2016 Madsen HH et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluation of unique identifiers used as keys to match identical publications in Pure and SciVal – a case study from health science
topic Research Article