Developing a standardized but extendable framework to increase the findability of infectious disease datasets
Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, enc...
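Main authors: | Tsueng, Ginger; Cano, Marco A. Alvarado; Bento, José; Czech, Candice; Kang, Mengjia; Pache, Lars; Rasmussen, Luke V.; Savidge, Tor C.; Starren, Justin; Wu, Qinglong; Xin, Jiwen; Yeaman, Michael R.; Zhou, Xinghua; Su, Andrew I.; Wu, Chunlei; Brown, Liliana; Shabman, Reed S.; Hughes, Laura D. |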
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9950378/ https://www.ncbi.nlm.nih.gov/pubmed/36823157 http://dx.doi.org/10.1038/s41597-023-01968-9 |
_version_ | 1784893150055104512 |
---|---|
author | Tsueng, Ginger; Cano, Marco A. Alvarado; Bento, José; Czech, Candice; Kang, Mengjia; Pache, Lars; Rasmussen, Luke V.; Savidge, Tor C.; Starren, Justin; Wu, Qinglong; Xin, Jiwen; Yeaman, Michael R.; Zhou, Xinghua; Su, Andrew I.; Wu, Chunlei; Brown, Liliana; Shabman, Reed S.; Hughes, Laura D. |
author_facet | Tsueng, Ginger; Cano, Marco A. Alvarado; Bento, José; Czech, Candice; Kang, Mengjia; Pache, Lars; Rasmussen, Luke V.; Savidge, Tor C.; Starren, Justin; Wu, Qinglong; Xin, Jiwen; Yeaman, Michael R.; Zhou, Xinghua; Su, Andrew I.; Wu, Chunlei; Brown, Liliana; Shabman, Reed S.; Hughes, Laura D. |
author_sort | Tsueng, Ginger |
collection | PubMed |
description | Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard, such as Schema.org, which is widely adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We alleviated this gap by creating a reusable metadata schema based on Schema.org and catalogued nearly 400 datasets and computational tools we collected. The approach is easily reusable to create schemas interoperable with community standards, but customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability. |
format | Online Article Text |
id | pubmed-9950378 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-99503782023-02-25 Developing a standardized but extendable framework to increase the findability of infectious disease datasets Tsueng, Ginger Cano, Marco A. Alvarado Bento, José Czech, Candice Kang, Mengjia Pache, Lars Rasmussen, Luke V. Savidge, Tor C. Starren, Justin Wu, Qinglong Xin, Jiwen Yeaman, Michael R. Zhou, Xinghua Su, Andrew I. Wu, Chunlei Brown, Liliana Shabman, Reed S. Hughes, Laura D. Sci Data Article Biomedical datasets are increasing in size, stored in many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard, such as Schema.org, which is widely-adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We alleviated this gap by creating a reusable metadata schema based on Schema.org and catalogued nearly 400 datasets and computational tools we collected. The approach is easily reusable to create schemas interoperable with community standards, but customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability. 
Nature Publishing Group UK 2023-02-23 /pmc/articles/PMC9950378/ /pubmed/36823157 http://dx.doi.org/10.1038/s41597-023-01968-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Tsueng, Ginger Cano, Marco A. Alvarado Bento, José Czech, Candice Kang, Mengjia Pache, Lars Rasmussen, Luke V. Savidge, Tor C. Starren, Justin Wu, Qinglong Xin, Jiwen Yeaman, Michael R. Zhou, Xinghua Su, Andrew I. Wu, Chunlei Brown, Liliana Shabman, Reed S. Hughes, Laura D. Developing a standardized but extendable framework to increase the findability of infectious disease datasets |
title | Developing a standardized but extendable framework to increase the findability of infectious disease datasets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9950378/ https://www.ncbi.nlm.nih.gov/pubmed/36823157 http://dx.doi.org/10.1038/s41597-023-01968-9 |