The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research
MEDLINE is the National Library of Medicine's (NLM) journal citation database. It contains over 28 million references to biomedical and life science journal articles, and a key feature of the database is that all articles are indexed with NLM Medical Subject Headings (MeSH). The library employs...
Main authors: | Rae, Alastair R., Mork, James G., Demner‐Fushman, Dina |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley & Sons, Inc. 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9937663/ https://www.ncbi.nlm.nih.gov/pubmed/36819642 http://dx.doi.org/10.1002/asi.24722 |
_version_ | 1784890472477491200 |
---|---|
author | Rae, Alastair R. Mork, James G. Demner‐Fushman, Dina |
author_facet | Rae, Alastair R. Mork, James G. Demner‐Fushman, Dina |
author_sort | Rae, Alastair R. |
collection | PubMed |
description | MEDLINE is the National Library of Medicine's (NLM) journal citation database. It contains over 28 million references to biomedical and life science journal articles, and a key feature of the database is that all articles are indexed with NLM Medical Subject Headings (MeSH). The library employs a team of MeSH indexers, and in recent years they have been asked to index close to 1 million articles per year in order to keep MEDLINE up to date. An important part of the MEDLINE indexing process is the assignment of articles to indexers. High quality and timely indexing is only possible when articles are assigned to indexers with suitable expertise. This article introduces the NLM indexer assignment dataset: a large dataset of 4.2 million indexer article assignments for articles indexed between 2011 and 2019. The dataset is shown to be a valuable testbed for expert matching and assignment algorithms, and indexer article assignment is also found to be useful domain‐adaptive pre‐training for the closely related task of reviewer assignment. |
format | Online Article Text |
id | pubmed-9937663 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | John Wiley & Sons, Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9937663 2023-04-14 The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research Rae, Alastair R. Mork, James G. Demner‐Fushman, Dina J Assoc Inf Sci Technol Research Articles MEDLINE is the National Library of Medicine's (NLM) journal citation database. It contains over 28 million references to biomedical and life science journal articles, and a key feature of the database is that all articles are indexed with NLM Medical Subject Headings (MeSH). The library employs a team of MeSH indexers, and in recent years they have been asked to index close to 1 million articles per year in order to keep MEDLINE up to date. An important part of the MEDLINE indexing process is the assignment of articles to indexers. High quality and timely indexing is only possible when articles are assigned to indexers with suitable expertise. This article introduces the NLM indexer assignment dataset: a large dataset of 4.2 million indexer article assignments for articles indexed between 2011 and 2019. The dataset is shown to be a valuable testbed for expert matching and assignment algorithms, and indexer article assignment is also found to be useful domain‐adaptive pre‐training for the closely related task of reviewer assignment. John Wiley & Sons, Inc. 2022-11-08 2023-02 /pmc/articles/PMC9937663/ /pubmed/36819642 http://dx.doi.org/10.1002/asi.24722 Text en Published 2022. This article is a U.S. Government work and is in the public domain in the USA. Journal of the Association for Information Science and Technology published by Wiley Periodicals LLC on behalf of Association for Information Science and Technology. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. |
spellingShingle | Research Articles Rae, Alastair R. Mork, James G. Demner‐Fushman, Dina The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title | The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title_full | The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title_fullStr | The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title_full_unstemmed | The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title_short | The National Library of Medicine indexer assignment dataset: A new large‐scale dataset for reviewer assignment research |
title_sort | national library of medicine indexer assignment dataset: a new large‐scale dataset for reviewer assignment research |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9937663/ https://www.ncbi.nlm.nih.gov/pubmed/36819642 http://dx.doi.org/10.1002/asi.24722 |