Examining influential factors for acknowledgements classification using supervised learning

Acknowledgements have been examined as important elements in measuring the contributions to and intellectual debts of a scientific publication. Unlike previous studies, which were limited in their scope of analysis and relied on manual examination, the present study aimed to conduct the automatic classification o...

Bibliographic Details
Main authors: Song, Min; Kang, Keun Young; Timakum, Tatsawan; Zhang, Xinyuan
Format: Online Article Text
Language: English
Published: Public Library of Science, 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7021295/
https://www.ncbi.nlm.nih.gov/pubmed/32059035
http://dx.doi.org/10.1371/journal.pone.0228928
author Song, Min
Kang, Keun Young
Timakum, Tatsawan
Zhang, Xinyuan
collection PubMed
description Acknowledgements have been examined as important elements in measuring the contributions to and intellectual debts of a scientific publication. Unlike previous studies, which were limited in their scope of analysis and relied on manual examination, the present study aimed to classify acknowledgements automatically on a large-scale dataset. To this end, we first created a training dataset for acknowledgements classification by sampling the acknowledgements sections from the entire PubMed Central database. Second, we adopted various supervised learning algorithms to examine which algorithm performed best under which conditions. In addition, we observed the factors affecting classification performance. We investigated the effects of three main aspects: classification algorithms, categories, and text representations. The CNN+Doc2Vec algorithm achieved the highest performance, with 93.58% accuracy on the original dataset and 87.93% on the converted dataset. The experimental results indicated that the characteristics of categories and sentence patterns influenced classification performance. Most of the classifiers performed better on the financial, peer interactive communication, and technical support categories than on the other classes.
format Online
Article
Text
id pubmed-7021295
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7021295 2020-02-26 Examining influential factors for acknowledgements classification using supervised learning Song, Min Kang, Keun Young Timakum, Tatsawan Zhang, Xinyuan PLoS One Research Article Acknowledgements have been examined as important elements in measuring the contributions to and intellectual debts of a scientific publication. Unlike previous studies, which were limited in their scope of analysis and relied on manual examination, the present study aimed to classify acknowledgements automatically on a large-scale dataset. To this end, we first created a training dataset for acknowledgements classification by sampling the acknowledgements sections from the entire PubMed Central database. Second, we adopted various supervised learning algorithms to examine which algorithm performed best under which conditions. In addition, we observed the factors affecting classification performance. We investigated the effects of three main aspects: classification algorithms, categories, and text representations. The CNN+Doc2Vec algorithm achieved the highest performance, with 93.58% accuracy on the original dataset and 87.93% on the converted dataset. The experimental results indicated that the characteristics of categories and sentence patterns influenced classification performance. Most of the classifiers performed better on the financial, peer interactive communication, and technical support categories than on the other classes. Public Library of Science 2020-02-14 /pmc/articles/PMC7021295/ /pubmed/32059035 http://dx.doi.org/10.1371/journal.pone.0228928 Text en © 2020 Song et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Examining influential factors for acknowledgements classification using supervised learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7021295/
https://www.ncbi.nlm.nih.gov/pubmed/32059035
http://dx.doi.org/10.1371/journal.pone.0228928