Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature

The NIF Registry developed and maintained by the Neuroscience Information Framework is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conc...

Bibliographic Details
Main Authors: Ozyurt, Ibrahim Burak, Grethe, Jeffrey S., Martone, Maryann E., Bandrowski, Anita E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156472/
https://www.ncbi.nlm.nih.gov/pubmed/26730820
http://dx.doi.org/10.1371/journal.pone.0146300
_version_ 1782481271600971776
author Ozyurt, Ibrahim Burak
Grethe, Jeffrey S.
Martone, Maryann E.
Bandrowski, Anita E.
author_facet Ozyurt, Ibrahim Burak
Grethe, Jeffrey S.
Martone, Maryann E.
Bandrowski, Anita E.
author_sort Ozyurt, Ibrahim Burak
collection PubMed
description The NIF Registry, developed and maintained by the Neuroscience Information Framework, is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over the years broadened in scope to include research resources of general relevance to biomedical research. The Registry currently lists over 13K research resources. The broadening in scope to biomedical science led us to re-christen the NIF Registry platform as SciCrunch. The NIF/SciCrunch Registry has been cataloging the resource landscape since 2006; as such, it serves as a valuable dataset for tracking the breadth, fate and utilization of these resources. Our experience shows that research resources like databases are dynamic objects that can change location and scope over time. Although each record is entered manually and human-curated, the current size of the registry requires tools that can aid in curation efforts to keep content up to date, including when and where such resources are used. To address this challenge, we have developed an open source tool suite, collectively termed RDW: Resource Disambiguator for the Web. RDW is designed to help in the upkeep and curation of the registry as well as in enhancing the content of the registry by automated extraction of resource candidates from the literature. The RDW toolkit includes a URL extractor for papers, a resource candidate screen, a resource URL change tracker, and a resource content change tracker. Curators access these tools via a web-based user interface. Several strategies are used to optimize these tools, including supervised and unsupervised learning algorithms as well as statistical text analysis. The complete tool suite is used to enhance and maintain the resource registry as well as track the usage of individual resources through an innovative literature citation index honed for research resources. Here we present an overview of the Registry and show how the RDW tools are used in curation and usage tracking.
format Online
Article
Text
id pubmed-5156472
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5156472 2016-12-21 Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature Ozyurt, Ibrahim Burak Grethe, Jeffrey S. Martone, Maryann E. Bandrowski, Anita E. PLoS One Research Article The NIF Registry, developed and maintained by the Neuroscience Information Framework, is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over the years broadened in scope to include research resources of general relevance to biomedical research. The Registry currently lists over 13K research resources. The broadening in scope to biomedical science led us to re-christen the NIF Registry platform as SciCrunch. The NIF/SciCrunch Registry has been cataloging the resource landscape since 2006; as such, it serves as a valuable dataset for tracking the breadth, fate and utilization of these resources. Our experience shows that research resources like databases are dynamic objects that can change location and scope over time. Although each record is entered manually and human-curated, the current size of the registry requires tools that can aid in curation efforts to keep content up to date, including when and where such resources are used. To address this challenge, we have developed an open source tool suite, collectively termed RDW: Resource Disambiguator for the Web. RDW is designed to help in the upkeep and curation of the registry as well as in enhancing the content of the registry by automated extraction of resource candidates from the literature. The RDW toolkit includes a URL extractor for papers, a resource candidate screen, a resource URL change tracker, and a resource content change tracker. Curators access these tools via a web-based user interface. Several strategies are used to optimize these tools, including supervised and unsupervised learning algorithms as well as statistical text analysis. The complete tool suite is used to enhance and maintain the resource registry as well as track the usage of individual resources through an innovative literature citation index honed for research resources. Here we present an overview of the Registry and show how the RDW tools are used in curation and usage tracking. Public Library of Science 2016-01-05 /pmc/articles/PMC5156472/ /pubmed/26730820 http://dx.doi.org/10.1371/journal.pone.0146300 Text en © 2016 Ozyurt et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Ozyurt, Ibrahim Burak
Grethe, Jeffrey S.
Martone, Maryann E.
Bandrowski, Anita E.
Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title_full Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title_fullStr Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title_full_unstemmed Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title_short Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
title_sort resource disambiguator for the web: extracting biomedical resources and their citations from the scientific literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156472/
https://www.ncbi.nlm.nih.gov/pubmed/26730820
http://dx.doi.org/10.1371/journal.pone.0146300
work_keys_str_mv AT ozyurtibrahimburak resourcedisambiguatorforthewebextractingbiomedicalresourcesandtheircitationsfromthescientificliterature
AT grethejeffreys resourcedisambiguatorforthewebextractingbiomedicalresourcesandtheircitationsfromthescientificliterature
AT martonemaryanne resourcedisambiguatorforthewebextractingbiomedicalresourcesandtheircitationsfromthescientificliterature
AT bandrowskianitae resourcedisambiguatorforthewebextractingbiomedicalresourcesandtheircitationsfromthescientificliterature