
How often do cancer researchers make their data and code available and what factors are associated with sharing?

BACKGROUND: Various stakeholders are calling for increased availability of data and code from cancer research. However, it is unclear how commonly these products are shared, and what factors are associated with sharing. Our objective was to evaluate how frequently oncology researchers make data and...


Bibliographic Details
Main Authors: Hamilton, Daniel G., Page, Matthew J., Finch, Sue, Everitt, Sarah, Fidler, Fiona
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646258/
https://www.ncbi.nlm.nih.gov/pubmed/36352426
http://dx.doi.org/10.1186/s12916-022-02644-2
author Hamilton, Daniel G.
Page, Matthew J.
Finch, Sue
Everitt, Sarah
Fidler, Fiona
collection PubMed
description BACKGROUND: Various stakeholders are calling for increased availability of data and code from cancer research. However, it is unclear how commonly these products are shared, and what factors are associated with sharing. Our objective was to evaluate how frequently oncology researchers make data and code available and explore factors associated with sharing.
METHODS: A cross-sectional analysis was performed on a random sample of 306 cancer-related articles indexed in PubMed in 2019 that studied research subjects with a cancer diagnosis. All articles were independently screened for eligibility by two authors. Outcomes of interest included the prevalence of affirmative sharing declarations and the rate at which declarations connected to data complying with key FAIR principles (e.g. posted to a recognised repository, assigned an identifier, data license outlined, non-proprietary formatting). We also investigated associations between sharing rates and several journal characteristics (e.g. sharing policies, publication models), study characteristics (e.g. cancer rarity, study design), open science practices (e.g. pre-registration, pre-printing) and subsequent citation rates between 2020 and 2021.
RESULTS: One in five studies declared data were publicly available (59/306, 19%, 95% CI: 15–24%). However, when data availability was investigated this percentage dropped to 16% (49/306, 95% CI: 12–20%), and then to less than 1% (1/306, 95% CI: 0–2%) when data were checked for compliance with key FAIR principles. While only 4% of articles that used inferential statistics reported code to be available (10/274, 95% CI: 2–6%), the odds of reporting code to be available were 5.6 times higher for researchers who shared data. Compliance with mandatory data and code sharing policies was observed in 48% (14/29) and 0% (0/6) of articles, respectively. However, 88% of articles (45/51) included data availability statements when required. Policies that encouraged data sharing did not appear to be any more effective than not having a policy at all. The only factors associated with higher rates of data sharing were studying rare cancers and using publicly available data to complement original research.
CONCLUSIONS: Data and code sharing in oncology occurs infrequently, and at a lower rate than would be expected given the prevalence of mandatory sharing policies. There is also a large gap between those declaring data to be available, and those archiving data in a way that facilitates its reuse. We encourage journals to actively check compliance with sharing policies, and researchers to consult community-accepted guidelines when archiving the products of their research.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12916-022-02644-2.
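The percentages quoted in the abstract can be recomputed from the reported counts. The sketch below is illustrative only: the record does not state which confidence interval method the authors used, so the Wilson score interval is an assumption here, and its bounds may differ from the published intervals by a percentage point. The function name `proportion_with_wilson_ci` and the dictionary labels are hypothetical and do not come from the article.

```python
import math


def proportion_with_wilson_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    Assumption: the interval method is not given in the record; Wilson is
    used here purely as a common choice for binomial proportions.
    """
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Counts (numerator, denominator) as quoted in the abstract.
reported_counts = {
    "declared data publicly available": (59, 306),
    "data actually available": (49, 306),
    "data compliant with key FAIR principles": (1, 306),
    "declared code available": (10, 274),
    "complied with mandatory data sharing policy": (14, 29),
    "included required data availability statement": (45, 51),
}

for label, (k, n) in reported_counts.items():
    p, lower, upper = proportion_with_wilson_ci(k, n)
    print(f"{label}: {k}/{n} = {p:.1%} (Wilson 95% CI {lower:.1%} to {upper:.1%})")
```

The point estimates depend only on the counts, so they should agree with the abstract after rounding; only the interval bounds depend on the assumed method.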
format Online
Article
Text
id pubmed-9646258
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9646258 2022-11-14 How often do cancer researchers make their data and code available and what factors are associated with sharing? Hamilton, Daniel G. Page, Matthew J. Finch, Sue Everitt, Sarah Fidler, Fiona BMC Med Research Article BioMed Central 2022-11-09 /pmc/articles/PMC9646258/ /pubmed/36352426 http://dx.doi.org/10.1186/s12916-022-02644-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title How often do cancer researchers make their data and code available and what factors are associated with sharing?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9646258/
https://www.ncbi.nlm.nih.gov/pubmed/36352426
http://dx.doi.org/10.1186/s12916-022-02644-2