Cargando…

Quantifying deep neural network uncertainty for atrial fibrillation detection with limited labels

Atrial fibrillation (AF) is the most common arrhythmia found in the intensive care unit (ICU), and is associated with many adverse outcomes. Effective handling of AF and similar arrhythmias is a vital part of modern critical care, but obtaining knowledge about both disease burden and effective inter...

Descripción completa

Detalles Bibliográficos
Autores principales: Chen, Brian, Javadi, Golara, Hamilton, Alexander, Sibley, Stephanie, Laird, Philip, Abolmaesumi, Purang, Maslove, David, Mousavi, Parvin
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Nature Publishing Group UK 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9684456/
https://www.ncbi.nlm.nih.gov/pubmed/36418604
http://dx.doi.org/10.1038/s41598-022-24574-y
Descripción
Sumario:Atrial fibrillation (AF) is the most common arrhythmia found in the intensive care unit (ICU), and is associated with many adverse outcomes. Effective handling of AF and similar arrhythmias is a vital part of modern critical care, but obtaining knowledge about both disease burden and effective interventions often requires costly clinical trials. A wealth of continuous, high frequency physiological data such as the waveforms derived from electrocardiogram telemetry are promising sources for enriching clinical research. Automated detection using machine learning and in particular deep learning has been explored as a solution for processing these data. However, a lack of labels, increased presence of noise, and inability to assess the quality and trustworthiness of many machine learning model predictions pose challenges to interpretation. In this work, we propose an approach for training deep AF models on limited, noisy data and report uncertainty in their predictions. Using techniques from the fields of weakly supervised learning, we leverage a surrogate model trained on non-ICU data to create imperfect labels for a large ICU telemetry dataset. We combine these weak labels with techniques to estimate model uncertainty without the need for extensive human data annotation. AF detection models trained using this process demonstrated higher classification performance (0.64–0.67 F1 score) and improved calibration (0.05–0.07 expected calibration error).