Automatic Identification of Information Quality Metrics in Health News Stories

Objective: Many online and printed media publish health news of questionable trustworthiness and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-relat...

Bibliographic Details
Main Authors:	Al-Jefri, Majed; Evans, Roger; Lee, Joon; Ghezzi, Pietro
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Public Health
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7775604/ https://www.ncbi.nlm.nih.gov/pubmed/33392124 http://dx.doi.org/10.3389/fpubh.2020.515347

_version_	1783630505517776896
author	Al-Jefri, Majed Evans, Roger Lee, Joon Ghezzi, Pietro
author_facet	Al-Jefri, Majed Evans, Roger Lee, Joon Ghezzi, Pietro
author_sort	Al-Jefri, Majed
collection	PubMed
description	Objective: Many online and printed media publish health news of questionable trustworthiness, and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-related news stories using natural language processing and machine learning. Materials and Methods: We used a database from the website HealthNewsReview.org, which aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria to critically analyze claims about health care interventions. In this work, we attempt to automate the evaluation process by identifying the indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT. Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%, while for others the results were less satisfactory due to the dataset size, the need for external knowledge, or the subjectivity in the evaluation process. Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why it varied between criteria. For some criteria, the obtained results were promising; however, automated evaluation of the other criteria may not yet replace the manual evaluation process, in which human experts interpret the meaning of the text and make use of external knowledge in their assessment.
format	Online Article Text
id	pubmed-7775604
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7775604 2021-01-02 Automatic Identification of Information Quality Metrics in Health News Stories Al-Jefri, Majed Evans, Roger Lee, Joon Ghezzi, Pietro Front Public Health Public Health Objective: Many online and printed media publish health news of questionable trustworthiness, and it may be difficult for laypersons to determine the information quality of such articles. The purpose of this work was to propose a methodology for the automatic assessment of the quality of health-related news stories using natural language processing and machine learning. Materials and Methods: We used a database from the website HealthNewsReview.org, which aims to improve the public dialogue about health care. HealthNewsReview.org developed a set of criteria to critically analyze claims about health care interventions. In this work, we attempt to automate the evaluation process by identifying the indicators of those criteria using natural language processing-based machine learning on a corpus of more than 1,300 news stories. We explored features ranging from simple n-grams to more advanced linguistic features and optimized the feature selection for each task. Additionally, we experimented with the pre-trained language model BERT. Results: For some criteria, such as mention of costs, benefits, harms, and “disease-mongering,” the evaluation results were promising, with an F1 measure reaching 81.94%, while for others the results were less satisfactory due to the dataset size, the need for external knowledge, or the subjectivity in the evaluation process. Conclusion: The criteria used here are more challenging than those addressed by previous work, and our aim was to investigate how much more difficult the machine learning task was, and how and why it varied between criteria. For some criteria, the obtained results were promising; however, automated evaluation of the other criteria may not yet replace the manual evaluation process, in which human experts interpret the meaning of the text and make use of external knowledge in their assessment. Frontiers Media S.A. 2020-12-18 /pmc/articles/PMC7775604/ /pubmed/33392124 http://dx.doi.org/10.3389/fpubh.2020.515347 Text en Copyright © 2020 Al-Jefri, Evans, Lee and Ghezzi. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Public Health Al-Jefri, Majed Evans, Roger Lee, Joon Ghezzi, Pietro Automatic Identification of Information Quality Metrics in Health News Stories
title	Automatic Identification of Information Quality Metrics in Health News Stories
title_full	Automatic Identification of Information Quality Metrics in Health News Stories
title_fullStr	Automatic Identification of Information Quality Metrics in Health News Stories
title_full_unstemmed	Automatic Identification of Information Quality Metrics in Health News Stories
title_short	Automatic Identification of Information Quality Metrics in Health News Stories
title_sort	automatic identification of information quality metrics in health news stories
topic	Public Health
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7775604/ https://www.ncbi.nlm.nih.gov/pubmed/33392124 http://dx.doi.org/10.3389/fpubh.2020.515347
work_keys_str_mv	AT aljefrimajed automaticidentificationofinformationqualitymetricsinhealthnewsstories AT evansroger automaticidentificationofinformationqualitymetricsinhealthnewsstories AT leejoon automaticidentificationofinformationqualitymetricsinhealthnewsstories AT ghezzipietro automaticidentificationofinformationqualitymetricsinhealthnewsstories
