Abstractive Arabic Text Summarization Based on Deep Learning

Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP). It remains one of the most important challenges for modern computer systems, despite their recent improvements. Many papers and research studies address this task in...

Bibliographic Details
Main authors:	Wazery, Y. M., Saleh, Marwa E., Alharbi, Abdullah, Ali, Abdelmgeid A.
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8767398/ https://www.ncbi.nlm.nih.gov/pubmed/35069714 http://dx.doi.org/10.1155/2022/1566890

_version_	1784634730510024704
author	Wazery, Y. M. Saleh, Marwa E. Alharbi, Abdullah Ali, Abdelmgeid A.
author_facet	Wazery, Y. M. Saleh, Marwa E. Alharbi, Abdullah Ali, Abdelmgeid A.
author_sort	Wazery, Y. M.
collection	PubMed
description	Text summarization (TS) is considered one of the most difficult tasks in natural language processing (NLP). It remains one of the most important challenges for modern computer systems, despite their recent improvements. Many studies in the literature address this task, but most focus on extractive summarization; few address abstractive summarization, especially for Arabic, owing to the complexity of the language. In this paper, an abstractive Arabic text summarization system based on a sequence-to-sequence model is proposed. The model consists of two components, an encoder and a decoder. Our aim is to build the sequence-to-sequence model with several deep artificial neural networks and investigate which of them achieves the best performance. Different layers of Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM) are used to build the encoder and the decoder. In addition, the global attention mechanism is used because it provides better results than the local attention mechanism. Furthermore, AraBERT preprocessing is applied in the data preprocessing stage, which helps the model understand Arabic words and achieve state-of-the-art results. Moreover, the skip-gram and continuous bag-of-words (CBOW) word2vec word embedding models are compared. We built these models using the Keras library and ran them in a Jupyter notebook on Google Colab. Finally, the proposed system is evaluated with the ROUGE-1, ROUGE-2, ROUGE-L, and BLEU metrics. The experimental results show that three layers of BiLSTM hidden states at the encoder achieve the best performance. In addition, the proposed system outperforms other recent studies. The results also show that abstractive summarization models that use the skip-gram word2vec model outperform those that use the CBOW word2vec model.
format	Online Article Text
id	pubmed-8767398
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-8767398 2022-01-20 Abstractive Arabic Text Summarization Based on Deep Learning Wazery, Y. M. Saleh, Marwa E. Alharbi, Abdullah Ali, Abdelmgeid A. Comput Intell Neurosci Research Article Hindawi 2022-01-11 /pmc/articles/PMC8767398/ /pubmed/35069714 http://dx.doi.org/10.1155/2022/1566890 Text en Copyright © 2022 Y.M. Wazery et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Wazery, Y. M. Saleh, Marwa E. Alharbi, Abdullah Ali, Abdelmgeid A. Abstractive Arabic Text Summarization Based on Deep Learning
title	Abstractive Arabic Text Summarization Based on Deep Learning
title_full	Abstractive Arabic Text Summarization Based on Deep Learning
title_fullStr	Abstractive Arabic Text Summarization Based on Deep Learning
title_full_unstemmed	Abstractive Arabic Text Summarization Based on Deep Learning
title_short	Abstractive Arabic Text Summarization Based on Deep Learning
title_sort	abstractive arabic text summarization based on deep learning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8767398/ https://www.ncbi.nlm.nih.gov/pubmed/35069714 http://dx.doi.org/10.1155/2022/1566890
work_keys_str_mv	AT wazeryym abstractivearabictextsummarizationbasedondeeplearning AT salehmarwae abstractivearabictextsummarizationbasedondeeplearning AT alharbiabdullah abstractivearabictextsummarizationbasedondeeplearning AT aliabdelmgeida abstractivearabictextsummarizationbasedondeeplearning
