Characterizing the (Perceived) Newsworthiness of Health Science Articles: A Data-Driven Approach
BACKGROUND: Health science findings are primarily disseminated through manuscript publications. Information subsidies are used to communicate newsworthy findings to journalists in an effort to earn mass media coverage and further disseminate health science research to mass audiences. Journal editors...
Main authors: | Zhang, Ye; Willis, Erin; Paul, Michael J; Elhadad, Noémie; Wallace, Byron C |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054236/ https://www.ncbi.nlm.nih.gov/pubmed/27658571 http://dx.doi.org/10.2196/medinform.5353 |
_version_ | 1782458556552839168 |
---|---|
author | Zhang, Ye; Willis, Erin; Paul, Michael J; Elhadad, Noémie; Wallace, Byron C |
author_facet | Zhang, Ye; Willis, Erin; Paul, Michael J; Elhadad, Noémie; Wallace, Byron C |
author_sort | Zhang, Ye |
collection | PubMed |
description | BACKGROUND: Health science findings are primarily disseminated through manuscript publications. Information subsidies are used to communicate newsworthy findings to journalists in an effort to earn mass media coverage and further disseminate health science research to mass audiences. Journal editors and news journalists then select which news stories receive coverage and thus public attention. OBJECTIVE: This study aims to identify attributes of published health science articles that correlate with (1) journal editor issuance of press releases and (2) mainstream media coverage. METHODS: We constructed four novel datasets to identify factors that correlate with press release issuance and media coverage. These corpora include thousands of published articles, subsets of which received press release or mainstream media coverage. We used statistical machine learning methods to identify correlations between words in the science abstracts and press release issuance and media coverage. Further, we used a topic modeling-based machine learning approach to uncover latent topics predictive of the perceived newsworthiness of science articles. RESULTS: Both press release issuance for, and media coverage of, health science articles are predictable from corresponding journal article content. For the former task, we achieved average areas under the curve (AUCs) of 0.666 (SD 0.019) and 0.882 (SD 0.018) on two separate datasets, comprising 3024 and 10,760 articles, respectively. For the latter task, models realized mean AUCs of 0.591 (SD 0.044) and 0.783 (SD 0.022) on two datasets, in this case containing 422 and 28,910 pairs, respectively. We reported most-predictive words and topics for press release or news coverage. CONCLUSIONS: We have presented a novel data-driven characterization of content that renders health science “newsworthy.” The analysis provides new insights into the news coverage selection process. For example, it appears epidemiological papers concerning common behaviors (eg, alcohol consumption) tend to receive media attention. |
format | Online Article Text |
id | pubmed-5054236 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5054236 2016-10-20 Characterizing the (Perceived) Newsworthiness of Health Science Articles: A Data-Driven Approach Zhang, Ye Willis, Erin Paul, Michael J Elhadad, Noémie Wallace, Byron C JMIR Med Inform Original Paper JMIR Publications 2016-09-22 /pmc/articles/PMC5054236/ /pubmed/27658571 http://dx.doi.org/10.2196/medinform.5353 Text en ©Ye Zhang, Erin Willis, Michael J Paul, Noémie Elhadad, Byron C Wallace. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 22.09.2016. https://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Zhang, Ye Willis, Erin Paul, Michael J Elhadad, Noémie Wallace, Byron C Characterizing the (Perceived) Newsworthiness of Health Science Articles: A Data-Driven Approach |
title | Characterizing the (Perceived) Newsworthiness of Health Science Articles: A Data-Driven Approach |
title_sort | characterizing the (perceived) newsworthiness of health science articles: a data-driven approach |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054236/ https://www.ncbi.nlm.nih.gov/pubmed/27658571 http://dx.doi.org/10.2196/medinform.5353 |
work_keys_str_mv | AT zhangye characterizingtheperceivednewsworthinessofhealthsciencearticlesadatadrivenapproach AT williserin characterizingtheperceivednewsworthinessofhealthsciencearticlesadatadrivenapproach AT paulmichaelj characterizingtheperceivednewsworthinessofhealthsciencearticlesadatadrivenapproach AT elhadadnoemie characterizingtheperceivednewsworthinessofhealthsciencearticlesadatadrivenapproach AT wallacebyronc characterizingtheperceivednewsworthinessofhealthsciencearticlesadatadrivenapproach |
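
The abstract in this record reports results as areas under the ROC curve (AUCs). As a reading aid, here is a minimal sketch of how an AUC can be computed from classifier scores, using the pairwise-ranking definition. The function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the article; the article's own models and evaluation pipeline are not described in this record.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = article received a press release, 0 = it did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
print(round(auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for the reported values between 0.591 and 0.882.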