Graph-based extractive text summarization method for Hausa text

Automatic text summarization is one of the most promising solutions to the ever-growing challenges of textual data as it produces a shorter version of the original document with fewer bytes, but the same information as the original document. Despite the advancements in automatic text summarization r...

Bibliographic Details
Main authors: Bichi, Abdulkadir Abubakar, Samsudin, Ruhaidah, Hassan, Rohayanti, Hasan, Layla Rasheed Abdallah, Ado Rogo, Abubakar
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168556/
https://www.ncbi.nlm.nih.gov/pubmed/37159449
http://dx.doi.org/10.1371/journal.pone.0285376
author Bichi, Abdulkadir Abubakar
Samsudin, Ruhaidah
Hassan, Rohayanti
Hasan, Layla Rasheed Abdallah
Ado Rogo, Abubakar
collection PubMed
description Automatic text summarization is one of the most promising solutions to the ever-growing challenges of textual data, as it produces a shorter version of the original document with fewer bytes while retaining the same information. Despite advances in automatic text summarization research, the development of summarization methods for documents written in Hausa, a Chadic language widely spoken in West Africa by approximately 150,000,000 people as either their first or second language, is still in its early stages. This study proposes a novel graph-based extractive single-document summarization method for Hausa text that modifies the existing PageRank algorithm by using the normalized count of common bigrams between adjacent sentences as the initial vertex score. The proposed method is evaluated with the ROUGE evaluation toolkit on a primarily collected Hausa summarization evaluation dataset comprising 113 Hausa news articles. The proposed approach outperformed the standard methods on the same dataset: the TextRank method by 2.1%, LexRank by 12.3%, the centroid-based method by 19.5%, and the BM25 method by 17.4%.
format Online
Article
Text
id pubmed-10168556
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10168556 2023-05-10 PLoS One Research Article Public Library of Science 2023-05-09 /pmc/articles/PMC10168556/ /pubmed/37159449 http://dx.doi.org/10.1371/journal.pone.0285376 Text en © 2023 Bichi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Graph-based extractive text summarization method for Hausa text
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168556/
https://www.ncbi.nlm.nih.gov/pubmed/37159449
http://dx.doi.org/10.1371/journal.pone.0285376
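
The description field of this record outlines the method only briefly: PageRank over a sentence graph, with the normalized count of common bigrams between adjacent sentences as the initial vertex score. The Python sketch below is one possible reading of that sentence, not the authors' implementation. Several details are assumptions made for illustration: the regex tokenizer, normalizing the bigram counts so they sum to 1, Jaccard word overlap as the edge weight, the damping factor of 0.85, and the fixed number of iterations. The record also does not say how the initial score influences the final ranking; here it is simply the starting vector, and the iteration count is kept small because plain PageRank run to convergence would largely forget its starting point.

```python
import re
from typing import List, Set, Tuple


def tokenize(sentence: str) -> List[str]:
    # Assumption: lowercase word tokens are a good enough unit for this sketch.
    return re.findall(r"\w+", sentence.lower())


def bigrams(tokens: List[str]) -> Set[Tuple[str, str]]:
    # Set of consecutive token pairs in one sentence.
    return set(zip(tokens, tokens[1:]))


def initial_scores(token_lists: List[List[str]]) -> List[float]:
    # Count bigrams each sentence shares with its previous and next sentence,
    # then normalize so the scores sum to 1 (assumed normalization).
    grams = [bigrams(t) for t in token_lists]
    n = len(grams)
    counts = []
    for i in range(n):
        shared = 0
        if i > 0:
            shared += len(grams[i] & grams[i - 1])
        if i + 1 < n:
            shared += len(grams[i] & grams[i + 1])
        counts.append(float(shared))
    total = sum(counts)
    if total == 0:
        return [1.0 / n] * n  # fall back to a uniform start
    return [c / total for c in counts]


def similarity(a: List[str], b: List[str]) -> float:
    # Assumed edge weight: Jaccard overlap of the two sentences' word sets.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def rank_sentences(sentences: List[str], damping: float = 0.85,
                   iterations: int = 10) -> List[float]:
    tokens = [tokenize(s) for s in sentences]
    n = len(tokens)
    if n == 0:
        return []
    # Weighted, undirected sentence graph without self-loops.
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = initial_scores(tokens)
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(sentences: List[str], k: int) -> List[str]:
    # Keep the k highest-scoring sentences, returned in document order.
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k)` on a list of already-split sentences returns `k` of them in their original order. Sentence splitting and any Hausa-specific preprocessing (stop words, stemming) are left out, since the record gives no details about them.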