
Reaching for upper bound ROUGE score of extractive summarization methods


Bibliographic Details
Main Authors: Akhmetov, Iskander, Mussabayev, Rustam, Gelbukh, Alexander
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575858/
https://www.ncbi.nlm.nih.gov/pubmed/36262160
http://dx.doi.org/10.7717/peerj-cs.1103
author Akhmetov, Iskander
Mussabayev, Rustam
Gelbukh, Alexander
author_facet Akhmetov, Iskander
Mussabayev, Rustam
Gelbukh, Alexander
author_sort Akhmetov, Iskander
collection PubMed
description Extractive text summarization (ETS) methods find the salient information in a text automatically by reusing exact sentences from the source text. In this article, we address the question of what summary quality can be achieved with ETS methods. To maximize the ROUGE-1 score, we used five approaches: (1) an adapted reduced variable neighborhood search (RVNS), (2) the Greedy algorithm, (3) VNS initialized by the Greedy algorithm results, (4) a genetic algorithm, and (5) a genetic algorithm initialized by the Greedy algorithm results. We ran experiments on articles from the arXiv dataset. As a result, we found that ROUGE-1 and ROUGE-2 scores of 0.59 and 0.25, respectively, are achievable with the genetic algorithm initialized by the Greedy algorithm results, which yielded the best results among the tested approaches. Moreover, those scores are higher than the scores obtained by current state-of-the-art text summarization models: the best ROUGE-1 score reported in the literature on the same dataset is 0.46. Therefore, there is room for further development of ETS methods, which are now undeservedly forgotten.
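
A minimal sketch of the greedy upper-bound idea described in the abstract: iteratively select the source sentence that most increases a ROUGE-1-style overlap with the reference summary. This is not the authors' implementation; the hand-rolled unigram-F1 scorer, the whitespace tokenization, the function names (rouge1_f1, greedy_oracle_summary), and the example texts are illustrative assumptions, whereas the paper evaluates with the standard ROUGE toolkit on the arXiv dataset.

from collections import Counter
from typing import List


def rouge1_f1(candidate: List[str], reference: List[str]) -> float:
    """Unigram-overlap F1 between candidate and reference token lists (simplified ROUGE-1)."""
    if not candidate or not reference:
        return 0.0
    # Clipped unigram overlap: each reference token counts at most as often as it appears.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def greedy_oracle_summary(sentences: List[str], reference: str, max_sentences: int = 5) -> List[str]:
    """Greedily add the sentence that most improves the summary's ROUGE-1 F1 against the reference."""
    ref_tokens = reference.lower().split()
    selected: List[int] = []
    summary_tokens: List[str] = []
    best_score = 0.0
    for _ in range(max_sentences):
        best_idx, best_new_score = None, best_score
        for i, sent in enumerate(sentences):
            if i in selected:
                continue
            candidate_tokens = summary_tokens + sent.lower().split()
            score = rouge1_f1(candidate_tokens, ref_tokens)
            if score > best_new_score:
                best_idx, best_new_score = i, score
        if best_idx is None:  # no remaining sentence improves the score
            break
        selected.append(best_idx)
        summary_tokens += sentences[best_idx].lower().split()
        best_score = best_new_score
    return [sentences[i] for i in sorted(selected)]


if __name__ == "__main__":
    source = [
        "Extractive summarization selects sentences directly from the source text.",
        "The weather was pleasant on the day the experiments were run.",
        "A greedy oracle picks sentences that maximize overlap with the reference summary.",
    ]
    reference = "Extractive summarization picks source sentences maximizing overlap with the reference."
    print(greedy_oracle_summary(source, reference, max_sentences=2))

The VNS and genetic-algorithm variants mentioned in the abstract can be seen as searching over the same space of sentence subsets, using the greedy result as a starting point rather than as the final answer.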
format Online
Article
Text
id pubmed-9575858
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9575858 2022-10-18 Reaching for upper bound ROUGE score of extractive summarization methods Akhmetov, Iskander Mussabayev, Rustam Gelbukh, Alexander PeerJ Comput Sci Artificial Intelligence Extractive text summarization (ETS) methods find the salient information in a text automatically by reusing exact sentences from the source text. In this article, we address the question of what summary quality can be achieved with ETS methods. To maximize the ROUGE-1 score, we used five approaches: (1) an adapted reduced variable neighborhood search (RVNS), (2) the Greedy algorithm, (3) VNS initialized by the Greedy algorithm results, (4) a genetic algorithm, and (5) a genetic algorithm initialized by the Greedy algorithm results. We ran experiments on articles from the arXiv dataset. As a result, we found that ROUGE-1 and ROUGE-2 scores of 0.59 and 0.25, respectively, are achievable with the genetic algorithm initialized by the Greedy algorithm results, which yielded the best results among the tested approaches. Moreover, those scores are higher than the scores obtained by current state-of-the-art text summarization models: the best ROUGE-1 score reported in the literature on the same dataset is 0.46. Therefore, there is room for further development of ETS methods, which are now undeservedly forgotten. PeerJ Inc. 2022-09-26 /pmc/articles/PMC9575858/ /pubmed/36262160 http://dx.doi.org/10.7717/peerj-cs.1103 Text en © 2022 Akhmetov et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
spellingShingle Artificial Intelligence
Akhmetov, Iskander
Mussabayev, Rustam
Gelbukh, Alexander
Reaching for upper bound ROUGE score of extractive summarization methods
title Reaching for upper bound ROUGE score of extractive summarization methods
title_full Reaching for upper bound ROUGE score of extractive summarization methods
title_fullStr Reaching for upper bound ROUGE score of extractive summarization methods
title_full_unstemmed Reaching for upper bound ROUGE score of extractive summarization methods
title_short Reaching for upper bound ROUGE score of extractive summarization methods
title_sort reaching for upper bound rouge score of extractive summarization methods
topic Artificial Intelligence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575858/
https://www.ncbi.nlm.nih.gov/pubmed/36262160
http://dx.doi.org/10.7717/peerj-cs.1103
work_keys_str_mv AT akhmetoviskander reachingforupperboundrougescoreofextractivesummarizationmethods
AT mussabayevrustam reachingforupperboundrougescoreofextractivesummarizationmethods
AT gelbukhalexander reachingforupperboundrougescoreofextractivesummarizationmethods