CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19

The number of research articles published on COVID-19 has dramatically increased since the outbreak of the pandemic in November 2019. This absurd rate of productivity in research articles leads to information overload. It has increasingly become urgent for researchers and medical associations to sta...

Bibliographic Details
Main Authors:	Karotia, Akanksha; Susan, Seba
Format:	Online Article Text
Language:	English
Published:	Springer US 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131559/ https://www.ncbi.nlm.nih.gov/pubmed/37359325 http://dx.doi.org/10.1007/s11227-023-05291-3

_version_	1785031203355623424
author	Karotia, Akanksha Susan, Seba
author_facet	Karotia, Akanksha Susan, Seba
author_sort	Karotia, Akanksha
collection	PubMed
description	The number of research articles published on COVID-19 has dramatically increased since the outbreak of the pandemic in November 2019. This absurd rate of productivity in research articles leads to information overload. It has increasingly become urgent for researchers and medical associations to stay up to date on the latest COVID-19 studies. To address information overload in COVID-19 scientific literature, the study presents a novel hybrid model named CovSumm, an unsupervised graph-based hybrid approach for single-document summarization, that is evaluated on the CORD-19 dataset. We have tested the proposed methodology on the scientific papers in the database dated from January 1, 2021 to December 31, 2021, consisting of 840 documents in total. The proposed text summarization is a hybrid of two distinctive extractive approaches (1) GenCompareSum (transformer-based approach) and (2) TextRank (graph-based approach). The sum of scores generated by both methods is used to rank the sentences for generating the summary. On the CORD-19, the recall-oriented understudy for gisting evaluation (ROUGE) score metric is used to compare the performance of the CovSumm model with various state-of-the-art techniques. The proposed method achieved the highest scores of ROUGE-1: 40.14%, ROUGE-2: 13.25%, and ROUGE-L: 36.32%. The proposed hybrid approach shows improved performance on the CORD-19 dataset when compared to existing unsupervised text summarization methods.
format	Online Article Text
id	pubmed-10131559
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-10131559 2023-04-27 CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19 Karotia, Akanksha Susan, Seba J Supercomput Article The number of research articles published on COVID-19 has dramatically increased since the outbreak of the pandemic in November 2019. This absurd rate of productivity in research articles leads to information overload. It has increasingly become urgent for researchers and medical associations to stay up to date on the latest COVID-19 studies. To address information overload in COVID-19 scientific literature, the study presents a novel hybrid model named CovSumm, an unsupervised graph-based hybrid approach for single-document summarization, that is evaluated on the CORD-19 dataset. We have tested the proposed methodology on the scientific papers in the database dated from January 1, 2021 to December 31, 2021, consisting of 840 documents in total. The proposed text summarization is a hybrid of two distinctive extractive approaches (1) GenCompareSum (transformer-based approach) and (2) TextRank (graph-based approach). The sum of scores generated by both methods is used to rank the sentences for generating the summary. On the CORD-19, the recall-oriented understudy for gisting evaluation (ROUGE) score metric is used to compare the performance of the CovSumm model with various state-of-the-art techniques. The proposed method achieved the highest scores of ROUGE-1: 40.14%, ROUGE-2: 13.25%, and ROUGE-L: 36.32%. The proposed hybrid approach shows improved performance on the CORD-19 dataset when compared to existing unsupervised text summarization methods. Springer US 2023-04-26 /pmc/articles/PMC10131559/ /pubmed/37359325 http://dx.doi.org/10.1007/s11227-023-05291-3 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle	Article Karotia, Akanksha Susan, Seba CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title	CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title_full	CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title_fullStr	CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title_full_unstemmed	CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title_short	CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
title_sort	covsumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for cord-19
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131559/ https://www.ncbi.nlm.nih.gov/pubmed/37359325 http://dx.doi.org/10.1007/s11227-023-05291-3
work_keys_str_mv	AT karotiaakanksha covsummanunsupervisedtransformercumgraphbasedhybriddocumentsummarizationmodelforcord19 AT susanseba covsummanunsupervisedtransformercumgraphbasedhybriddocumentsummarizationmodelforcord19
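
The description field above says that CovSumm ranks sentences by the sum of the scores produced by GenCompareSum and TextRank. The following is only a rough sketch of that score-sum idea, not the authors' implementation. It assumes a plain word-overlap TextRank and takes the second score list as an already-computed input that stands in for GenCompareSum; the function names (`sentence_similarity`, `textrank_scores`, `summed_score_summary`) and the parameters `damping`, `iterations` and `k` are hypothetical. The record does not say whether the two score sets are normalized before summing, so the sketch adds them as they are.

```python
import math
import re


def sentence_similarity(a, b):
    """Word-overlap similarity, as in the original TextRank formulation."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank_scores(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence-similarity graph (power iteration)."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [
        [sentence_similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summed_score_summary(sentences, other_scores, k=3):
    """Pick the k sentences with the highest summed score, in document order.

    other_scores is a stand-in for the per-sentence scores of the second
    method (GenCompareSum in the record); it is an assumed input here.
    """
    graph_scores = textrank_scores(sentences)
    combined = [g + o for g, o in zip(graph_scores, other_scores)]
    top = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summed_score_summary(sentences, other_scores, k=2)` returns the two highest-ranked sentences in their original order. Because the scores are added unnormalized, whichever method produces the larger values dominates the ranking.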

Similar Items