Association mapping in biomedical time series via statistically significant shapelet mining
MOTIVATION: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from...
Main authors: | Bock, Christian; Gumbsch, Thomas; Moor, Michael; Rieck, Bastian; Roqueiro, Damian; Borgwardt, Karsten |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022601/ https://www.ncbi.nlm.nih.gov/pubmed/29949972 http://dx.doi.org/10.1093/bioinformatics/bty246 |
_version_ | 1783335713170784256 |
---|---|
author | Bock, Christian; Gumbsch, Thomas; Moor, Michael; Rieck, Bastian; Roqueiro, Damian; Borgwardt, Karsten |
author_facet | Bock, Christian; Gumbsch, Thomas; Moor, Michael; Rieck, Bastian; Roqueiro, Damian; Borgwardt, Karsten |
author_sort | Bock, Christian |
collection | PubMed |
description | MOTIVATION: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from either poor predictive performance or weak explanatory power. Recent developments in time series classification focus on discovering shapelets, i.e. subsequences that are most predictive in terms of class membership. Shapelets have the advantage of combining a high predictive performance with an interpretable component: their shape. Currently, most shapelet discovery methods do not rely on statistical tests to verify the significance of individual shapelets. Therefore, identifying associations between the shapelets of physiological biomarkers and patients that exhibit certain phenotypes of interest enables the discovery and subsequent ranking of physiological signatures that are interpretable, statistically validated and accurate predictors of clinical endpoints. RESULTS: We present a novel and scalable method for scanning time series and identifying discriminative patterns that are statistically significant. The significance of a shapelet is evaluated while accounting for the multiple hypothesis testing problem, which is mitigated by efficiently pruning untestable shapelet candidates with Tarone’s method. We demonstrate the utility of our method by discovering patterns in three of a patient’s vital signs (heart rate, respiratory rate and systolic blood pressure) that are indicators of the severity of a future sepsis event, i.e. an inflammatory response to an infective agent that can lead to organ failure and death if not treated in time. AVAILABILITY AND IMPLEMENTATION: We make our method and the scripts that are required to reproduce the experiments publicly available at https://github.com/BorgwardtLab/S3M. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6022601 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6022601 2018-07-10 Bioinformatics Ismb 2018–Intelligent Systems for Molecular Biology Proceedings Oxford University Press 2018-07-01 2018-06-27 /pmc/articles/PMC6022601/ /pubmed/29949972 http://dx.doi.org/10.1093/bioinformatics/bty246 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings Bock, Christian Gumbsch, Thomas Moor, Michael Rieck, Bastian Roqueiro, Damian Borgwardt, Karsten Association mapping in biomedical time series via statistically significant shapelet mining |
title | Association mapping in biomedical time series via statistically significant shapelet mining |
title_full | Association mapping in biomedical time series via statistically significant shapelet mining |
title_fullStr | Association mapping in biomedical time series via statistically significant shapelet mining |
title_full_unstemmed | Association mapping in biomedical time series via statistically significant shapelet mining |
title_short | Association mapping in biomedical time series via statistically significant shapelet mining |
title_sort | association mapping in biomedical time series via statistically significant shapelet mining |
topic | Ismb 2018–Intelligent Systems for Molecular Biology Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6022601/ https://www.ncbi.nlm.nih.gov/pubmed/29949972 http://dx.doi.org/10.1093/bioinformatics/bty246 |
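
The description above only summarises the approach, so the following is a minimal sketch of the general idea rather than the authors' S3M implementation (which is available at the GitHub URL given in the record). It assumes a Euclidean sliding-window distance to decide whether a shapelet occurs in a series, a two-sided Fisher's exact test on the resulting 2×2 contingency table, and Tarone's testability criterion for the corrected threshold. The choice of test and distance, and all function names (`shapelet_distance`, `min_p`, `tarone_threshold`), are illustrative assumptions not taken from the record.

```python
from math import comb, sqrt


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equally long window of the series (assumes len(series) >= len(shapelet))."""
    m = len(shapelet)
    return min(
        sqrt(sum((s - t) ** 2 for s, t in zip(series[i:i + m], shapelet)))
        for i in range(len(series) - m + 1)
    )


def hypergeom_pmf(a, n, n1, x):
    """Probability that a of the x matching series are cases,
    given n series in total of which n1 are cases."""
    return comb(n1, a) * comb(n - n1, x - a) / comb(n, x)


def fisher_p(a, n, n1, x):
    """Two-sided Fisher p-value: total probability of all tables
    that are no more likely than the observed one."""
    lo, hi = max(0, x - (n - n1)), min(x, n1)
    p_obs = hypergeom_pmf(a, n, n1, x)
    total = sum(
        p
        for p in (hypergeom_pmf(k, n, n1, x) for k in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


def min_p(n, n1, x):
    """Minimum attainable p-value for fixed margins; it is reached at
    one of the two most extreme tables."""
    lo, hi = max(0, x - (n - n1)), min(x, n1)
    return min(fisher_p(lo, n, n1, x), fisher_p(hi, n, n1, x))


def tarone_threshold(min_pvalues, alpha=0.05):
    """Tarone's criterion: find the smallest k such that at most k
    candidates are testable at level alpha / k, and return (alpha / k, k)."""
    k = 1
    while sum(p <= alpha / k for p in min_pvalues) > k:
        k += 1
    return alpha / k, k
```

A candidate whose minimum attainable p-value exceeds the returned threshold can never become significant, so it can be discarded without being tested and does not count towards the correction factor. This is why the corrected threshold can be less strict than a Bonferroni correction over all candidates.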