
An efficient approach for textual data classification using deep learning

Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text class...

Bibliographic Details
Main authors: Alqahtani, Abdullah; Ullah Khan, Habib; Alsubai, Shtwai; Sha, Mohemmed; Almadhor, Ahmad; Iqbal, Tayyab; Abbas, Sidra
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Neuroscience
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521674/
https://www.ncbi.nlm.nih.gov/pubmed/36185709
http://dx.doi.org/10.3389/fncom.2022.992296
_version_ 1784799891968491520
author Alqahtani, Abdullah
Ullah Khan, Habib
Alsubai, Shtwai
Sha, Mohemmed
Almadhor, Ahmad
Iqbal, Tayyab
Abbas, Sidra
author_facet Alqahtani, Abdullah
Ullah Khan, Habib
Alsubai, Shtwai
Sha, Mohemmed
Almadhor, Ahmad
Iqbal, Tayyab
Abbas, Sidra
author_sort Alqahtani, Abdullah
collection PubMed
description Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text classification since it achieves high accuracy with lower-level engineering and processing. This paper employs machine learning and deep learning techniques to classify textual data. Textual data contains much useless information that must be pre-processed. We clean the data, impute missing values, and eliminate repeated columns. Next, we employ three machine learning algorithms, namely logistic regression, random forest, and K-nearest neighbors (KNN), and three deep learning algorithms, namely long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU), for classification. Results reveal that LSTM achieves 92% accuracy, outperforming all other models and baseline studies.
format Online
Article
Text
id pubmed-9521674
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-95216742022-09-30 An efficient approach for textual data classification using deep learning Alqahtani, Abdullah Ullah Khan, Habib Alsubai, Shtwai Sha, Mohemmed Almadhor, Ahmad Iqbal, Tayyab Abbas, Sidra Front Comput Neurosci Neuroscience Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text classification since it achieves high accuracy with lower-level engineering and processing. This paper employs machine learning and deep learning techniques to classify textual data. Textual data contains much useless information that must be pre-processed. We clean the data, impute missing values, and eliminate repeated columns. Next, we employ three machine learning algorithms, namely logistic regression, random forest, and K-nearest neighbors (KNN), and three deep learning algorithms, namely long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU), for classification. Results reveal that LSTM achieves 92% accuracy, outperforming all other models and baseline studies. Frontiers Media S.A. 2022-09-15 /pmc/articles/PMC9521674/ /pubmed/36185709 http://dx.doi.org/10.3389/fncom.2022.992296 Text en Copyright © 2022 Alqahtani, Ullah Khan, Alsubai, Sha, Almadhor, Iqbal and Abbas. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Alqahtani, Abdullah
Ullah Khan, Habib
Alsubai, Shtwai
Sha, Mohemmed
Almadhor, Ahmad
Iqbal, Tayyab
Abbas, Sidra
An efficient approach for textual data classification using deep learning
title An efficient approach for textual data classification using deep learning
title_full An efficient approach for textual data classification using deep learning
title_fullStr An efficient approach for textual data classification using deep learning
title_full_unstemmed An efficient approach for textual data classification using deep learning
title_short An efficient approach for textual data classification using deep learning
title_sort efficient approach for textual data classification using deep learning
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521674/
https://www.ncbi.nlm.nih.gov/pubmed/36185709
http://dx.doi.org/10.3389/fncom.2022.992296