A Random Shuffle Method to Expand a Narrow Dataset and Overcome the Associated Challenges in a Clinical Study: A Heart Failure Cohort Example

Heart failure (HF) affects at least 26 million people worldwide, so predicting adverse events in HF patients represents a major target of clinical data science. However, achieving large sample sizes sometimes represents a challenge due to difficulties in patient recruiting and long follow-up times,...

Bibliographic Details
Main Authors: Fassina, Lorenzo; Faragli, Alessandro; Lo Muzio, Francesco Paolo; Kelle, Sebastian; Campana, Carlo; Pieske, Burkert; Edelmann, Frank; Alogna, Alessio
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Cardiovascular Medicine
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7714902/
https://www.ncbi.nlm.nih.gov/pubmed/33330661
http://dx.doi.org/10.3389/fcvm.2020.599923
_version_ 1783618829402767360
author Fassina, Lorenzo
Faragli, Alessandro
Lo Muzio, Francesco Paolo
Kelle, Sebastian
Campana, Carlo
Pieske, Burkert
Edelmann, Frank
Alogna, Alessio
collection PubMed
description Heart failure (HF) affects at least 26 million people worldwide, so predicting adverse events in HF patients is a major target of clinical data science. However, achieving large sample sizes can be challenging because of difficulties in patient recruitment and long follow-up times, which also aggravate the problem of missing data. Population-enhancing algorithms are therefore crucial for overcoming a narrow dataset cardinality (in a clinical dataset, the cardinality is the number of patients in that dataset). The aim of this study was to design a random shuffle method that enhances the cardinality of an HF dataset while remaining statistically legitimate, without the need for specific hypotheses or regression models. The cardinality enhancement was validated against an established random repeated-measures method with regard to correctness in predicting clinical conditions and endpoints. In particular, machine learning and regression models were employed to highlight the benefits of the enhanced datasets. The proposed random shuffle method enhanced the HF dataset cardinality (711 patients before dataset preprocessing) circa 10 times, and circa 21 times when followed by a random repeated-measures approach. We believe that the random shuffle method could be used in the cardiovascular field and in other data science problems where missing data and narrow dataset cardinality are an issue.
format Online
Article
Text
id pubmed-7714902
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7714902 2020-12-15 A Random Shuffle Method to Expand a Narrow Dataset and Overcome the Associated Challenges in a Clinical Study: A Heart Failure Cohort Example. Fassina, Lorenzo; Faragli, Alessandro; Lo Muzio, Francesco Paolo; Kelle, Sebastian; Campana, Carlo; Pieske, Burkert; Edelmann, Frank; Alogna, Alessio. Front Cardiovasc Med (Cardiovascular Medicine). Frontiers Media S.A. 2020-11-20 /pmc/articles/PMC7714902/ /pubmed/33330661 http://dx.doi.org/10.3389/fcvm.2020.599923 Text en
Copyright © 2020 Fassina, Faragli, Lo Muzio, Kelle, Campana, Pieske, Edelmann and Alogna. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Random Shuffle Method to Expand a Narrow Dataset and Overcome the Associated Challenges in a Clinical Study: A Heart Failure Cohort Example
topic Cardiovascular Medicine