Ensemble hybrid model for Hindi COVID-19 text classification with metaheuristic optimization algorithm

Bibliographic Details
Main authors: Jain, Vipin; Kashyap, Kanchan Lata
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9589711/
https://www.ncbi.nlm.nih.gov/pubmed/36313485
http://dx.doi.org/10.1007/s11042-022-13937-2
author Jain, Vipin
Kashyap, Kanchan Lata
collection PubMed
description The SARS-CoV-2 virus has spread around the globe since March 2020. Millions of people worldwide have been infected with the coronavirus. People from every country expressed their sentiments about the coronavirus on social media. The aim of this work is to determine the general public opinion of Indian Twitter users about the coronavirus. Hindi tweets posted about COVID-19 are used as input data for sentiment analysis. Natural language processing is applied to the input data for feature extraction. The optimal features are then selected from the pre-processed data using the metaheuristic-based Grey Wolf Optimization technique. Finally, a hybrid of a convolutional neural network (CNN) and a long short-term memory (LSTM) model is employed to categorize the sentiments as positive, negative, or neutral. The outcome of the proposed model is compared with other machine learning techniques, namely Random Forest, Decision Tree, K-Nearest Neighbor, Naive Bayes, Support Vector Machine (SVM), CNN, LSTM, LSTM–CNN, and CNN–LSTM. The highest accuracies of 87.75%, 88.41%, 87.89%, 85.54%, 89.11%, 91.46%, 88.72%, 91.54%, and 92.34% are obtained by Random Forest, Decision Tree, K-Nearest Neighbor, Naive Bayes, SVM, CNN, LSTM, LSTM–CNN, and CNN–LSTM, respectively. The proposed ensemble hybrid model achieves the highest classification accuracy, precision, recall, and F-score: 95.54%, 91.44%, 89.63%, and 90.87%, respectively.
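The feature-selection step named in the description can be illustrated with a minimal sketch of a binary Grey Wolf Optimizer. This is not the authors' implementation: the record gives no parameters or fitness function, so the function names (binary_gwo, toy_fitness), the 0.5 threshold for turning positions into feature masks, the pack size, the iteration count, and the RELEVANCE weights are all assumptions chosen for illustration. In practice the fitness would typically be a classifier's validation error plus a penalty on the number of selected features.

```python
import math
import random

# Hypothetical per-feature usefulness scores, standing in for a real classifier.
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.0]


def toy_fitness(mask):
    """Stand-in cost: 'error' falls as relevant features are kept, plus a size penalty."""
    kept = sum(r * m for r, m in zip(RELEVANCE, mask))
    error = 1.0 - kept / sum(RELEVANCE)
    return error + 0.05 * sum(mask) / len(mask)


def to_mask(position):
    """Threshold a continuous position in [0, 1] into a 0/1 feature mask (assumed rule)."""
    return [1 if p > 0.5 else 0 for p in position]


def binary_gwo(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Minimise fitness(mask) over feature masks; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, math.inf

    for it in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w)))
        leaders = [list(w) for w in ranked[:3]]
        score = fitness(to_mask(leaders[0]))
        if score < best_score:
            best_mask, best_score = to_mask(leaders[0]), score

        a = 2.0 * (1.0 - it / n_iters)  # shrinks linearly from 2 toward 0
        updated = []
        for wolf in wolves:
            position = []
            for j in range(n_features):
                total = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - wolf[j])
                    total += leader[j] - A * D
                # Average the three leader-guided moves and clamp to [0, 1].
                position.append(min(1.0, max(0.0, total / 3.0)))
            updated.append(position)
        wolves = updated

    # Score the final pack, which the loop has not evaluated yet.
    for wolf in wolves:
        score = fitness(to_mask(wolf))
        if score < best_score:
            best_mask, best_score = to_mask(wolf), score
    return best_mask, best_score


if __name__ == "__main__":
    mask, score = binary_gwo(toy_fitness, n_features=len(RELEVANCE))
    print("selected mask:", mask, "cost:", round(score, 4))
```

The returned mask marks which feature columns to keep before training a downstream classifier such as the CNN-LSTM hybrid mentioned in the description. A fixed seed makes the search reproducible.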
format Online
Article
Text
id pubmed-9589711
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-9589711 2022-10-24 Ensemble hybrid model for Hindi COVID-19 text classification with metaheuristic optimization algorithm Jain, Vipin Kashyap, Kanchan Lata Multimed Tools Appl Article
Springer US 2022-10-24 2023 /pmc/articles/PMC9589711/ /pubmed/36313485 http://dx.doi.org/10.1007/s11042-022-13937-2 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Ensemble hybrid model for Hindi COVID-19 text classification with metaheuristic optimization algorithm
topic Article