A Method for the Automated, Reliable Retrieval of Publication-Citation Records

BACKGROUND: Publication records and citation indices often are used to evaluate academic performance. For this reason, obtaining or computing them accurately is important. This can be difficult, largely due to a lack of complete knowledge of an individual's publication list and/or lack of time...

Bibliographic Details
Main authors: Ruths, Derek; Zamal, Faiyaz Al
Format: Text
Language: English
Published: Public Library of Science 2010
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2924380/
https://www.ncbi.nlm.nih.gov/pubmed/20808858
http://dx.doi.org/10.1371/journal.pone.0012133
author Ruths, Derek
Zamal, Faiyaz Al
collection PubMed
description BACKGROUND: Publication records and citation indices are often used to evaluate academic performance. For this reason, obtaining or computing them accurately is important. This can be difficult, largely because an individual's complete publication list is often not known and because there is often too little time to obtain or construct the publication-citation record manually. While online publication search engines have partly addressed these problems, using raw search results can yield inaccurate estimates of publication-citation records and citation indices.
METHODOLOGY: In this paper, we present a new, automated method that estimates an individual's publication-citation record from the individual's name and a set of domain-specific vocabulary that may occur in the individual's publication titles. Because this vocabulary can be harvested directly from a research web page or an online (partial) publication list, our method offers an easy way to obtain estimates of a publication-citation record and the relevant citation indices. Our method works by applying a series of stringent name and content filters to the raw publication search results returned by an online publication search engine. In this paper, our method is run using Google Scholar, but the underlying filters can be easily applied to any existing publication search engine. When compared against a manually constructed data set of individuals and their publication-citation records, our method provides significant improvements over raw search results. The estimated publication-citation records returned by our method have an average sensitivity of [Image: see text] and specificity of [Image: see text] (in contrast to raw search result specificity of less than 10%). When citation indices are computed using these records, the estimated indices are within [Image: see text] of the true value, compared with raw search results, which overestimate the indices by, on average, [Image: see text].
CONCLUSIONS: These results confirm that our method provides significantly improved estimates over raw search results, and these can either be used directly for large-scale (departmental or university) analysis or further refined manually to quickly give accurate publication-citation records.
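The filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout (`id`, `title`, `authors`, `cites`), the "initial plus last name" author format, the `min_hits` threshold, and all function names are assumptions made for illustration, and the h-index is used only as one familiar example of a citation index.

```python
import re


def normalize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def name_matches(authors, last, first_initial):
    """Name filter (assumed form): some author has the target last name
    and a first token starting with the target initial."""
    for author in authors:
        tokens = normalize(author)
        if (tokens and tokens[-1] == last.lower()
                and tokens[0].startswith(first_initial.lower())):
            return True
    return False


def filter_records(records, last, first_initial, vocabulary, min_hits=1):
    """Keep records that pass the name filter and whose title contains
    at least min_hits words from the domain vocabulary."""
    vocab = {word.lower() for word in vocabulary}
    kept = []
    for rec in records:
        if not name_matches(rec["authors"], last, first_initial):
            continue
        hits = vocab.intersection(normalize(rec["title"]))
        if len(hits) >= min_hits:
            kept.append(rec)
    return kept


def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citation_counts, reverse=True), 1):
        if cites < rank:
            break
        h = rank
    return h


def sensitivity_specificity(estimated, true, candidates):
    """Compare an estimated record set with a manually built one.
    candidates is the full set of raw search result ids."""
    estimated, true, candidates = set(estimated), set(true), set(candidates)
    tp = len(estimated & true)
    fn = len(true - estimated)
    fp = len(estimated - true)
    tn = len(candidates - estimated - true)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical raw search results for a made-up researcher "J Doe".
RAW = [
    {"id": 1, "title": "Network inference in signaling pathways",
     "authors": ["J Doe", "A Roe"], "cites": 12},
    {"id": 2, "title": "Soil erosion in river deltas",
     "authors": ["J Doe"], "cites": 40},
    {"id": 3, "title": "Signaling network simulation",
     "authors": ["M Doering", "J Doe"], "cites": 5},
    {"id": 4, "title": "Network models of signaling",
     "authors": ["A Smith"], "cites": 30},
]

if __name__ == "__main__":
    kept = filter_records(RAW, "Doe", "J", ["network", "signaling"])
    print([r["id"] for r in kept])
    print(h_index([r["cites"] for r in RAW]),
          h_index([r["cites"] for r in kept]))
```

On this made-up data the filters keep records 1 and 3, and the h-index drops from 4 on the raw results to 2 on the filtered ones, which mirrors the kind of overestimate the abstract attributes to raw search results.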
format Text
id pubmed-2924380
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2924380 2010-08-31 A Method for the Automated, Reliable Retrieval of Publication-Citation Records Ruths, Derek Zamal, Faiyaz Al PLoS One Research Article Public Library of Science 2010-08-19 /pmc/articles/PMC2924380/ /pubmed/20808858 http://dx.doi.org/10.1371/journal.pone.0012133 Text en Ruths, Zamal. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Method for the Automated, Reliable Retrieval of Publication-Citation Records
topic Research Article
work_keys_str_mv AT ruthsderek amethodfortheautomatedreliableretrievalofpublicationcitationrecords
AT zamalfaiyazal amethodfortheautomatedreliableretrievalofpublicationcitationrecords
AT ruthsderek methodfortheautomatedreliableretrievalofpublicationcitationrecords
AT zamalfaiyazal methodfortheautomatedreliableretrievalofpublicationcitationrecords