Association mapping in biomedical time series via statistically significant shapelet mining

MOTIVATION: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from...

Bibliographic Details
Main authors: Bock, Christian; Gumbsch, Thomas; Moor, Michael; Rieck, Bastian; Roqueiro, Damian; Borgwardt, Karsten
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022601/
https://www.ncbi.nlm.nih.gov/pubmed/29949972
http://dx.doi.org/10.1093/bioinformatics/bty246
_version_ 1783335713170784256
author Bock, Christian
Gumbsch, Thomas
Moor, Michael
Rieck, Bastian
Roqueiro, Damian
Borgwardt, Karsten
author_facet Bock, Christian
Gumbsch, Thomas
Moor, Michael
Rieck, Bastian
Roqueiro, Damian
Borgwardt, Karsten
author_sort Bock, Christian
collection PubMed
description MOTIVATION: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from either poor predictive performance or weak explanatory power. Recent developments in time series classification focus on discovering shapelets, i.e. subsequences that are most predictive in terms of class membership. Shapelets have the advantage of combining a high predictive performance with an interpretable component—their shape. Currently, most shapelet discovery methods do not rely on statistical tests to verify the significance of individual shapelets. Therefore, identifying associations between the shapelets of physiological biomarkers and patients that exhibit certain phenotypes of interest enables the discovery and subsequent ranking of physiological signatures that are interpretable, statistically validated and accurate predictors of clinical endpoints. RESULTS: We present a novel and scalable method for scanning time series and identifying discriminative patterns that are statistically significant. The significance of a shapelet is evaluated while considering the problem of multiple hypothesis testing and mitigating it by efficiently pruning untestable shapelet candidates with Tarone’s method. We demonstrate the utility of our method by discovering patterns in three of a patient’s vital signs: heart rate, respiratory rate and systolic blood pressure that are indicators of the severity of a future sepsis event, i.e. an inflammatory response to an infective agent that can lead to organ failure and death, if not treated in time. AVAILABILITY AND IMPLEMENTATION: We make our method and the scripts that are required to reproduce the experiments publicly available at https://github.com/BorgwardtLab/S3M. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
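The testing procedure outlined in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation (that is available in the S3M repository linked above). The function names, the Euclidean sliding-window distance, and the hypergeometric (Fisher-type) bound on the minimum attainable p-value are all assumptions made for illustration.

```python
from math import comb, sqrt


def shapelet_distance(shapelet, series):
    """Smallest Euclidean distance between the shapelet and any
    equally long window of the series (hypothetical helper)."""
    w = len(shapelet)
    return min(
        sqrt(sum((s - x) ** 2 for s, x in zip(shapelet, series[i:i + w])))
        for i in range(len(series) - w + 1)
    )


def min_attainable_pvalue(n, n1, r):
    """Lower bound on the p-value of any 2x2 table with fixed margins.

    n  : total number of time series
    n1 : number of series in the positive class
    r  : number of series in which the shapelet 'occurs'
         (distance below some threshold)
    Assumption: the hypergeometric null of Fisher's exact test. The
    least probable table sits at one end of the support, so its
    probability bounds every attainable p-value from below.
    """
    def prob(a):
        return comb(n1, a) * comb(n - n1, r - a) / comb(n, r)

    a_lo = max(0, r - (n - n1))
    a_hi = min(n1, r)
    return min(prob(a_lo), prob(a_hi))


def tarone_threshold(min_pvalues, alpha=0.05):
    """Tarone's corrected significance threshold.

    Finds the smallest k such that at most k candidates are testable
    at level alpha / k, i.e. have a minimum attainable p-value that
    does not exceed alpha / k. Untestable candidates can never become
    significant, so they do not count towards the correction factor.
    """
    k = 1
    while sum(p <= alpha / k for p in min_pvalues) > k:
        k += 1
    return alpha / k


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(shapelet_distance([1, 2], [0, 1, 2, 5]))
    print(min_attainable_pvalue(10, 5, 5))
    print(tarone_threshold([0.001, 0.002, 0.5, 0.9]))
```

In this sketch a candidate whose minimum attainable p-value exceeds the current threshold is pruned without ever computing its actual p-value, which is the reason a Tarone-style correction is typically less conservative than a plain Bonferroni correction over all candidates.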
format Online
Article
Text
id pubmed-6022601
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6022601 2018-07-10 Association mapping in biomedical time series via statistically significant shapelet mining Bock, Christian Gumbsch, Thomas Moor, Michael Rieck, Bastian Roqueiro, Damian Borgwardt, Karsten Bioinformatics Ismb 2018–Intelligent Systems for Molecular Biology Proceedings MOTIVATION: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from either poor predictive performance or weak explanatory power. Recent developments in time series classification focus on discovering shapelets, i.e. subsequences that are most predictive in terms of class membership. Shapelets have the advantage of combining a high predictive performance with an interpretable component—their shape. Currently, most shapelet discovery methods do not rely on statistical tests to verify the significance of individual shapelets. Therefore, identifying associations between the shapelets of physiological biomarkers and patients that exhibit certain phenotypes of interest enables the discovery and subsequent ranking of physiological signatures that are interpretable, statistically validated and accurate predictors of clinical endpoints. RESULTS: We present a novel and scalable method for scanning time series and identifying discriminative patterns that are statistically significant. The significance of a shapelet is evaluated while considering the problem of multiple hypothesis testing and mitigating it by efficiently pruning untestable shapelet candidates with Tarone’s method. We demonstrate the utility of our method by discovering patterns in three of a patient’s vital signs: heart rate, respiratory rate and systolic blood pressure that are indicators of the severity of a future sepsis event, i.e. an inflammatory response to an infective agent that can lead to organ failure and death, if not treated in time. AVAILABILITY AND IMPLEMENTATION: We make our method and the scripts that are required to reproduce the experiments publicly available at https://github.com/BorgwardtLab/S3M. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022601/ /pubmed/29949972 http://dx.doi.org/10.1093/bioinformatics/bty246 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Ismb 2018–Intelligent Systems for Molecular Biology Proceedings
Bock, Christian
Gumbsch, Thomas
Moor, Michael
Rieck, Bastian
Roqueiro, Damian
Borgwardt, Karsten
Association mapping in biomedical time series via statistically significant shapelet mining
title Association mapping in biomedical time series via statistically significant shapelet mining
title_full Association mapping in biomedical time series via statistically significant shapelet mining
title_fullStr Association mapping in biomedical time series via statistically significant shapelet mining
title_full_unstemmed Association mapping in biomedical time series via statistically significant shapelet mining
title_short Association mapping in biomedical time series via statistically significant shapelet mining
title_sort association mapping in biomedical time series via statistically significant shapelet mining
topic Ismb 2018–Intelligent Systems for Molecular Biology Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022601/
https://www.ncbi.nlm.nih.gov/pubmed/29949972
http://dx.doi.org/10.1093/bioinformatics/bty246
work_keys_str_mv AT bockchristian associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining
AT gumbschthomas associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining
AT moormichael associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining
AT rieckbastian associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining
AT roqueirodamian associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining
AT borgwardtkarsten associationmappinginbiomedicaltimeseriesviastatisticallysignificantshapeletmining