
Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling

Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques b...


Bibliographic Details
Main authors: Rivest, Maxime, Vignola-Gagné, Etienne, Archambault, Éric
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112690/
https://www.ncbi.nlm.nih.gov/pubmed/33974653
http://dx.doi.org/10.1371/journal.pone.0251493
author Rivest, Maxime
Vignola-Gagné, Etienne
Archambault, Éric
author_facet Rivest, Maxime
Vignola-Gagné, Etienne
Archambault, Éric
author_sort Rivest, Maxime
collection PubMed
description Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques become available and as new research topics emerge. Convolutional neural networks, a subset of “deep learning” approaches, have recently offered novel and highly performant methods for classifying voluminous corpora of text. This article benchmarks a deep learning classification technique on more than 40 million scientific articles and on tens of thousands of scholarly journals. The comparison is performed against bibliographic coupling-, direct citation-, and manual-based classifications, the established and most widely used approaches in the field of bibliometrics, and by extension, in many science and innovation policy activities such as grant competition management. The results reveal that the performance of this first iteration of a deep learning approach is equivalent to the graph-based bibliometric approaches. All methods presented are also on par with manual classification. Somewhat surprisingly, no machine learning approaches were found to clearly outperform the simple label propagation approach that is direct citation. In conclusion, deep learning is promising because it performed just as well as the other approaches but has more flexibility to be further improved. For example, a deep neural network incorporating information from the citation network is likely to hold the key to an even better classification algorithm.
format Online
Article
Text
id pubmed-8112690
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8112690 2021-05-24 Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling Rivest, Maxime Vignola-Gagné, Etienne Archambault, Éric PLoS One Research Article Classification schemes for scientific activity and publications underpin a large swath of research evaluation practices at the organizational, governmental, and national levels. Several research classifications are currently in use, and they require continuous work as new classification techniques become available and as new research topics emerge. Convolutional neural networks, a subset of “deep learning” approaches, have recently offered novel and highly performant methods for classifying voluminous corpora of text. This article benchmarks a deep learning classification technique on more than 40 million scientific articles and on tens of thousands of scholarly journals. The comparison is performed against bibliographic coupling-, direct citation-, and manual-based classifications, the established and most widely used approaches in the field of bibliometrics, and by extension, in many science and innovation policy activities such as grant competition management. The results reveal that the performance of this first iteration of a deep learning approach is equivalent to the graph-based bibliometric approaches. All methods presented are also on par with manual classification. Somewhat surprisingly, no machine learning approaches were found to clearly outperform the simple label propagation approach that is direct citation. In conclusion, deep learning is promising because it performed just as well as the other approaches but has more flexibility to be further improved. For example, a deep neural network incorporating information from the citation network is likely to hold the key to an even better classification algorithm.
Public Library of Science 2021-05-11 /pmc/articles/PMC8112690/ /pubmed/33974653 http://dx.doi.org/10.1371/journal.pone.0251493 Text en © 2021 Rivest et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Rivest, Maxime
Vignola-Gagné, Etienne
Archambault, Éric
Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling
title Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8112690/
https://www.ncbi.nlm.nih.gov/pubmed/33974653
http://dx.doi.org/10.1371/journal.pone.0251493