Text classification technique for discovering country-based publications from international COVID-19 publications
OBJECTIVE: The significant increase in the number of COVID-19 publications, on the one hand, and the strategic importance of this subject area for research and treatment systems in the health field, on the other hand, reveals the need for text-mining research more than ever. The main objective of th...
Main Authors: | Danesh, Farshid; Dastani, Meisam |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2023 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328158/ https://www.ncbi.nlm.nih.gov/pubmed/37426592 http://dx.doi.org/10.1177/20552076231185674 |
_version_ | 1785069739370872832 |
---|---|
author | Danesh, Farshid Dastani, Meisam |
author_facet | Danesh, Farshid Dastani, Meisam |
author_sort | Danesh, Farshid |
collection | PubMed |
description | OBJECTIVE: The significant increase in the number of COVID-19 publications, on the one hand, and the strategic importance of this subject area for research and treatment systems in the health field, on the other hand, reveals the need for text-mining research more than ever. The main objective of the present paper is to discover country-based publications from international COVID-19 publications with text classification techniques. METHODS: The present paper is applied research that has been performed using text-mining techniques such as clustering and text classification. The statistical population is all COVID-19 publications from PubMed Central® (PMC), extracted from November 2019 to June 2021. Latent Dirichlet allocation (LDA) was used for clustering, and support vector machine (SVM), scikit-learn library, and Python programming language were used for text classification. Text classification was applied to discover the consistency of Iranian and international topics. RESULTS: The findings showed that seven topics were extracted using the LDA algorithm for international and Iranian publications on COVID-19. Moreover, the COVID-19 publications show the largest share in the subject area of “Social and Technology in COVID-19” at the international (April 2021) and national (February 2021) levels with 50.61% and 39.44%, respectively. The highest rate of publications at international and national levels was in April 2021 and February 2021, respectively. CONCLUSION: One of the most important results of this study was discovering a common trend and consistency of Iranian and international publications on COVID-19. Accordingly, in the topic category “Covid-19 Proteins: Vaccine and Antibody Response,” Iranian publications have a common publishing and research trend with international ones. |
format | Online Article Text |
id | pubmed-10328158 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10328158 2023-07-08 Text classification technique for discovering country-based publications from international COVID-19 publications Danesh, Farshid Dastani, Meisam Digit Health Original Research OBJECTIVE: The significant increase in the number of COVID-19 publications, on the one hand, and the strategic importance of this subject area for research and treatment systems in the health field, on the other hand, reveals the need for text-mining research more than ever. The main objective of the present paper is to discover country-based publications from international COVID-19 publications with text classification techniques. METHODS: The present paper is applied research that has been performed using text-mining techniques such as clustering and text classification. The statistical population is all COVID-19 publications from PubMed Central® (PMC), extracted from November 2019 to June 2021. Latent Dirichlet allocation (LDA) was used for clustering, and support vector machine (SVM), scikit-learn library, and Python programming language were used for text classification. Text classification was applied to discover the consistency of Iranian and international topics. RESULTS: The findings showed that seven topics were extracted using the LDA algorithm for international and Iranian publications on COVID-19. Moreover, the COVID-19 publications show the largest share in the subject area of “Social and Technology in COVID-19” at the international (April 2021) and national (February 2021) levels with 50.61% and 39.44%, respectively. The highest rate of publications at international and national levels was in April 2021 and February 2021, respectively. CONCLUSION: One of the most important results of this study was discovering a common trend and consistency of Iranian and international publications on COVID-19. Accordingly, in the topic category “Covid-19 Proteins: Vaccine and Antibody Response,” Iranian publications have a common publishing and research trend with international ones. SAGE Publications 2023-07-03 /pmc/articles/PMC10328158/ /pubmed/37426592 http://dx.doi.org/10.1177/20552076231185674 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Original Research Danesh, Farshid Dastani, Meisam Text classification technique for discovering country-based publications from international COVID-19 publications |
title | Text classification technique for discovering country-based publications from international COVID-19 publications |
title_full | Text classification technique for discovering country-based publications from international COVID-19 publications |
title_fullStr | Text classification technique for discovering country-based publications from international COVID-19 publications |
title_full_unstemmed | Text classification technique for discovering country-based publications from international COVID-19 publications |
title_short | Text classification technique for discovering country-based publications from international COVID-19 publications |
title_sort | text classification technique for discovering country-based publications from international covid-19 publications |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328158/ https://www.ncbi.nlm.nih.gov/pubmed/37426592 http://dx.doi.org/10.1177/20552076231185674 |
work_keys_str_mv | AT daneshfarshid textclassificationtechniquefordiscoveringcountrybasedpublicationsfrominternationalcovid19publications AT dastanimeisam textclassificationtechniquefordiscoveringcountrybasedpublicationsfrominternationalcovid19publications |
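
The abstract in this record names two scikit-learn steps: LDA for topic clustering and an SVM for text classification. The sketch below illustrates those two steps on a toy corpus. It is only an illustration under stated assumptions, not the authors' pipeline: the documents, the label names `proteins` and `social_tech`, the two-topic setting, and the choice of `LinearSVC` with TF-IDF features are all hypothetical. In a workflow like the one the abstract describes, the training labels would plausibly come from the topic step; here they are hand-assigned so the example stays deterministic.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus (hypothetical; not data from the paper).
docs = [
    "vaccine antibody response spike protein",
    "antibody protein vaccine immune response",
    "spike protein vaccine antibody trial",
    "telemedicine online education social media",
    "social media technology online learning",
    "online technology telemedicine social distancing",
]

# Step 1: LDA on raw word counts gives each document a topic distribution.
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # shape: (n_docs, n_topics)

# Step 2: a linear SVM on TF-IDF features assigns topic labels to new text.
# Labels are hand-assigned here (assumption) rather than taken from LDA.
labels = ["proteins"] * 3 + ["social_tech"] * 3
classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
classifier.fit(docs, labels)

new_docs = [
    "vaccine antibody protein study",
    "online social media technology",
]
predicted = classifier.predict(new_docs)
for text, label in zip(new_docs, predicted):
    print(f"{label}: {text}")
```

Training the classifier on one collection and predicting on another is one way to compare topic coverage across collections, for example international versus national publications.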