
Text classification technique for discovering country-based publications from international COVID-19 publications


Bibliographic Details
Main authors: Danesh, Farshid, Dastani, Meisam
Format: Online Article Text
Language: English
Published: SAGE Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328158/
https://www.ncbi.nlm.nih.gov/pubmed/37426592
http://dx.doi.org/10.1177/20552076231185674
_version_ 1785069739370872832
author Danesh, Farshid
Dastani, Meisam
collection PubMed
description OBJECTIVE: The significant increase in the number of COVID-19 publications, on the one hand, and the strategic importance of this subject area for research and treatment systems in the health field, on the other hand, reveals the need for text-mining research more than ever. The main objective of the present paper is to discover country-based publications from international COVID-19 publications with text classification techniques. METHODS: The present paper is applied research that has been performed using text-mining techniques such as clustering and text classification. The statistical population is all COVID-19 publications from PubMed Central® (PMC), extracted from November 2019 to June 2021. Latent Dirichlet allocation (LDA) was used for clustering, and support vector machine (SVM), scikit-learn library, and Python programming language were used for text classification. Text classification was applied to discover the consistency of Iranian and international topics. RESULTS: The findings showed that seven topics were extracted using the LDA algorithm for international and Iranian publications on COVID-19. Moreover, the COVID-19 publications show the largest share in the subject area of “Social and Technology in COVID-19” at the international (April 2021) and national (February 2021) levels with 50.61% and 39.44%, respectively. The highest rate of publications at international and national levels was in April 2021 and February 2021, respectively. CONCLUSION: One of the most important results of this study was discovering a common trend and consistency of Iranian and international publications on COVID-19. Accordingly, in the topic category “Covid-19 Proteins: Vaccine and Antibody Response,” Iranian publications have a common publishing and research trend with international ones.
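The METHODS statement above names LDA for clustering and an SVM with scikit-learn for text classification. The following is a minimal sketch of that kind of pipeline, not the authors' implementation. The toy corpus, the label names, the helper functions fit_topics and train_classifier, the choice of two topics (the study reports seven), and the use of a linear SVM (LinearSVC) over TF-IDF features are all assumptions made for illustration; the record does not state the kernel, features, or preprocessing used.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus standing in for PMC records; not data from the study.
docs = [
    "vaccine antibody response spike protein",
    "spike protein antibody neutralization vaccine",
    "telemedicine social media technology lockdown",
    "online education technology social distancing",
]
# Hypothetical topic labels, loosely echoing two topic names in the abstract.
labels = ["proteins", "proteins", "social_tech", "social_tech"]


def fit_topics(texts, n_topics=2, seed=0):
    """Fit LDA on raw term counts and return the document-topic matrix."""
    vectorizer = CountVectorizer()
    counts = vectorizer.fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    doc_topic = lda.fit_transform(counts)  # each row is a topic distribution
    return vectorizer, lda, doc_topic


def train_classifier(texts, y):
    """Train a linear SVM on TF-IDF features (an assumed feature choice)."""
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(texts, y)
    return model


vectorizer, lda, doc_topic = fit_topics(docs)
classifier = train_classifier(docs, labels)

# Assign an unseen (hypothetical) abstract to one of the labelled topics.
prediction = classifier.predict(["antibody response after vaccine"])[0]
print(doc_topic.shape, prediction)
```

In a workflow like the one described, the topic labels learned from one collection (for example, the international publications) would serve as training labels, and the classifier would then be applied to the other collection (the Iranian publications) to compare topic shares over time.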
format Online
Article
Text
id pubmed-10328158
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-10328158 2023-07-08 Text classification technique for discovering country-based publications from international COVID-19 publications Danesh, Farshid Dastani, Meisam Digit Health Original Research SAGE Publications 2023-07-03 /pmc/articles/PMC10328158/ /pubmed/37426592 http://dx.doi.org/10.1177/20552076231185674 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Text classification technique for discovering country-based publications from international COVID-19 publications
topic Original Research