Graph-based extractive text summarization method for Hausa text
Automatic text summarization is one of the most promising solutions to the ever-growing challenges of textual data as it produces a shorter version of the original document with fewer bytes, but the same information as the original document. Despite the advancements in automatic text summarization r...
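Main Authors: | Bichi, Abdulkadir Abubakar; Samsudin, Ruhaidah; Hassan, Rohayanti; Hasan, Layla Rasheed Abdallah; Ado Rogo, Abubakar |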
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168556/ https://www.ncbi.nlm.nih.gov/pubmed/37159449 http://dx.doi.org/10.1371/journal.pone.0285376 |
_version_ | 1785038877727129600 |
---|---|
author | Bichi, Abdulkadir Abubakar; Samsudin, Ruhaidah; Hassan, Rohayanti; Hasan, Layla Rasheed Abdallah; Ado Rogo, Abubakar |
author_sort | Bichi, Abdulkadir Abubakar |
collection | PubMed |
description | Automatic text summarization is one of the most promising solutions to the ever-growing volume of textual data, as it produces a shorter version of the original document with fewer bytes but the same information. Despite advances in automatic text summarization research, the development of summarization methods for documents written in Hausa, a Chadic language widely spoken in West Africa by approximately 150,000,000 people as either their first or second language, is still at an early stage. This study proposes a novel graph-based extractive single-document summarization method for Hausa text that modifies the existing PageRank algorithm by using the normalized count of common bigrams between adjacent sentences as the initial vertex score. The proposed method is evaluated with the ROUGE evaluation toolkit on a primarily collected Hausa summarization evaluation dataset comprising 113 Hausa news articles. The proposed approach outperformed the standard methods on the same dataset: it outperformed the TextRank method by 2.1%, LexRank by 12.3%, the centroid-based method by 19.5%, and the BM25 method by 17.4%. |
format | Online Article Text |
id | pubmed-10168556 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10168556 2023-05-10 Graph-based extractive text summarization method for Hausa text Bichi, Abdulkadir Abubakar; Samsudin, Ruhaidah; Hassan, Rohayanti; Hasan, Layla Rasheed Abdallah; Ado Rogo, Abubakar PLoS One Research Article Public Library of Science 2023-05-09 /pmc/articles/PMC10168556/ /pubmed/37159449 http://dx.doi.org/10.1371/journal.pone.0285376 Text en © 2023 Bichi et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Graph-based extractive text summarization method for Hausa text |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10168556/ https://www.ncbi.nlm.nih.gov/pubmed/37159449 http://dx.doi.org/10.1371/journal.pone.0285376 |
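
The record's description outlines a PageRank variant in which each sentence vertex starts from a score based on the normalized count of bigrams it shares with its adjacent sentences. The Python sketch below illustrates that idea only; it is not the authors' implementation. Several details are assumptions not stated in the record: the function names, the Jaccard word overlap used for edge weights, the add-one smoothing, and the reuse of the initial scores as the teleport term (without that, plain power iteration on a connected graph would converge to the same ranking whatever the starting scores).

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a simple stand-in for Hausa-specific preprocessing."""
    return re.findall(r"\w+", sentence.lower())


def bigrams(tokens):
    """Multiset of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def initial_scores(token_lists):
    """Normalized count of bigrams shared with the previous and next sentence."""
    grams = [bigrams(t) for t in token_lists]
    n = len(grams)
    raw = []
    for i in range(n):
        shared = 0
        if i > 0:
            shared += sum((grams[i] & grams[i - 1]).values())
        if i < n - 1:
            shared += sum((grams[i] & grams[i + 1]).values())
        raw.append(shared + 1.0)  # add-one smoothing (assumption) keeps scores positive
    total = sum(raw)
    return [r / total for r in raw]


def edge_weight(tokens_a, tokens_b):
    """Jaccard word overlap between two sentences (assumed similarity measure)."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Weighted PageRank seeded, and teleported, by the bigram-based scores."""
    token_lists = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []
    weights = [
        [edge_weight(token_lists[i], token_lists[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    prior = initial_scores(token_lists)
    scores = prior[:]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Vertices with no outgoing weight contribute nothing here.
            incoming = sum(
                weights[j][i] * scores[j] / out_sum[j]
                for j in range(n) if out_sum[j] > 0
            )
            updated.append((1 - damping) * prior[i] + damping * incoming)
        scores = updated
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns three sentences in document order. The tokenizer is deliberately naive; a real Hausa pipeline would likely need language-specific normalization and stop-word handling, which the record does not describe.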