Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature
The NIF Registry developed and maintained by the Neuroscience Information Framework is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conc...
Main authors: | Ozyurt, Ibrahim Burak; Grethe, Jeffrey S.; Martone, Maryann E.; Bandrowski, Anita E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156472/ https://www.ncbi.nlm.nih.gov/pubmed/26730820 http://dx.doi.org/10.1371/journal.pone.0146300 |
_version_ | 1782481271600971776 |
---|---|
author | Ozyurt, Ibrahim Burak Grethe, Jeffrey S. Martone, Maryann E. Bandrowski, Anita E. |
collection | PubMed |
description | The NIF Registry, developed and maintained by the Neuroscience Information Framework, is a cooperative project aimed at cataloging research resources, e.g., software tools, databases and tissue banks, funded largely by governments and available as tools to research scientists. Although originally conceived for neuroscience, the NIF Registry has over the years broadened in scope to include research resources of general relevance to biomedical research. The Registry currently lists over 13K research resources. The broadening in scope to biomedical science led us to re-christen the NIF Registry platform as SciCrunch. The NIF/SciCrunch Registry has been cataloging the resource landscape since 2006; as such, it serves as a valuable dataset for tracking the breadth, fate and utilization of these resources. Our experience shows that research resources like databases are dynamic objects that can change location and scope over time. Although each record is entered manually and human-curated, the current size of the registry requires tools that can aid curation efforts to keep content up to date, including when and where such resources are used. To address this challenge, we have developed an open source tool suite, collectively termed RDW: Resource Disambiguator for the Web. RDW is designed to help in the upkeep and curation of the registry as well as in enhancing the content of the registry by automated extraction of resource candidates from the literature. The RDW toolkit includes a URL extractor from papers, a resource candidate screen, a resource URL change tracker, and a resource content change tracker. Curators access these tools via a web-based user interface. Several strategies are used to optimize these tools, including supervised and unsupervised learning algorithms as well as statistical text analysis. The complete tool suite is used to enhance and maintain the resource registry as well as track the usage of individual resources through an innovative literature citation index honed for research resources. Here we present an overview of the Registry and show how the RDW tools are used in curation and usage tracking. |
format | Online Article Text |
id | pubmed-5156472 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-5156472 2016-12-21 Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature Ozyurt, Ibrahim Burak Grethe, Jeffrey S. Martone, Maryann E. Bandrowski, Anita E. PLoS One Research Article Public Library of Science 2016-01-05 /pmc/articles/PMC5156472/ /pubmed/26730820 http://dx.doi.org/10.1371/journal.pone.0146300 Text en © 2016 Ozyurt et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5156472/ https://www.ncbi.nlm.nih.gov/pubmed/26730820 http://dx.doi.org/10.1371/journal.pone.0146300 |