
Check-worthy claim detection across topics for automated fact-checking


Bibliographic Details
Main authors:	Abumansour, Amani S.; Zubiaga, Arkaitz
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2023
Subjects:	Data Mining and Machine Learning
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280541/ https://www.ncbi.nlm.nih.gov/pubmed/37346573 http://dx.doi.org/10.7717/peerj-cs.1365

author	Abumansour, Amani S. Zubiaga, Arkaitz
collection	PubMed
description	An important component of an automated fact-checking system is the claim check-worthiness detection system, which ranks sentences by prioritising them based on their need to be checked. Despite a body of research tackling the task, previous research has overlooked the challenging nature of identifying check-worthy claims across different topics. In this article, we assess and quantify the challenge of detecting check-worthy claims for new, unseen topics. After highlighting the problem, we propose the AraCWA model to mitigate the performance deterioration when detecting check-worthy claims across topics. The AraCWA model enables boosting the performance for new topics by incorporating two components for few-shot learning and data augmentation. Using a publicly available dataset of Arabic tweets consisting of 14 different topics, we demonstrate that our proposed data augmentation strategy achieves substantial improvements across topics overall, where the extent of the improvement varies across topics. Further, we analyse the semantic similarities between topics, suggesting that the similarity metric could be used as a proxy to determine the difficulty level of an unseen topic prior to undertaking the task of labelling the underlying sentences.
format	Online Article Text
id	pubmed-10280541
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-10280541 2023-06-21. Check-worthy claim detection across topics for automated fact-checking. Abumansour, Amani S.; Zubiaga, Arkaitz. PeerJ Comput Sci. Data Mining and Machine Learning. PeerJ Inc. 2023-05-16. /pmc/articles/PMC10280541/ /pubmed/37346573 http://dx.doi.org/10.7717/peerj-cs.1365. Text en. © 2023 Abumansour and Zubiaga. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	Check-worthy claim detection across topics for automated fact-checking
topic	Data Mining and Machine Learning
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280541/ https://www.ncbi.nlm.nih.gov/pubmed/37346573 http://dx.doi.org/10.7717/peerj-cs.1365
