
An accurate probabilistic step finder for time-series analysis

Noisy time-series data is commonly collected from sources including Förster Resonance Energy Transfer experiments, patch clamp and force spectroscopy setups, among many others. Two of the most common paradigms for the detection of discrete transitions in such time-series data include: hidden Markov...


Bibliographic Details
Main authors: Rojewski, Alex, Schweiger, Maxwell, Sgouralis, Ioannis, Comstock, Matthew, Pressé, Steve
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10541599/
https://www.ncbi.nlm.nih.gov/pubmed/37786687
http://dx.doi.org/10.1101/2023.09.19.558535
author Rojewski, Alex
Schweiger, Maxwell
Sgouralis, Ioannis
Comstock, Matthew
Pressé, Steve
collection PubMed
description Noisy time-series data is commonly collected from sources including Förster Resonance Energy Transfer experiments and patch clamp and force spectroscopy setups, among many others. Two of the most common paradigms for detecting discrete transitions in such data are hidden Markov models (HMMs) and step-finding algorithms. HMMs, including their extensions to infinite state spaces, inherently assume that holding times in the discrete states visited are geometrically (or, loosely speaking, exponentially) distributed. The determination of step locations by HMMs, especially in sparse and noisy data, is therefore biased toward steps that yield geometric holding times. In contrast, existing step-finding algorithms, while free of this constraint, often rely on ad hoc metrics to penalize the steps recovered in time traces (using various information criteria) and otherwise rely on approximate greedy algorithms to identify putative global optima. Here, instead, we devise a robust and general probabilistic (Bayesian) step-finding tool that neither relies on ad hoc metrics to penalize step numbers nor assumes geometric holding times in each state. As the number of steps in a time series is a priori unknown, we treat it within a Bayesian nonparametric (BNP) paradigm. We find that the method developed, Bayesian Nonparametric Step (BNP-Step), accurately determines the number and location of transitions between discrete states without any assumed kinetic model and learns the emission distribution characteristic of each state. In doing so, we verify that BNP-Step can analyze sparser data sets containing higher noise and more closely spaced states than can otherwise be resolved by current state-of-the-art methods. What is more, BNP-Step rigorously propagates measurement uncertainty into uncertainty over state transition locations, numbers, and emission levels, as characterized by the posterior. We demonstrate the performance of BNP-Step on both synthetic data and data drawn from force spectroscopy experiments.
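The description's remark that HMMs imply geometric holding times can be illustrated with a minimal sketch. It is not BNP-Step and not the authors' code. It assumes a symmetric two-state discrete-time Markov chain in which the state persists from one frame to the next with probability `p_stay`; the function names and the value 0.9 are hypothetical choices made for illustration. Under that assumption a dwell of exactly k frames has probability p_stay^(k-1) (1 - p_stay), with mean 1 / (1 - p_stay).

```python
import random


def geometric_pmf(k, p_stay):
    """Probability that a dwell lasts exactly k frames (k >= 1)."""
    return p_stay ** (k - 1) * (1.0 - p_stay)


def simulate_dwell_times(p_stay, n_frames, seed=0):
    """Simulate a symmetric two-state Markov chain; return completed dwell lengths.

    Only the dwell lengths matter here, so the state label itself is not
    tracked: each frame the chain either stays (probability p_stay) or
    switches to the other state, which ends the current dwell.
    """
    rng = random.Random(seed)
    dwells = []
    current = 1  # frames spent so far in the current state
    for _ in range(n_frames - 1):
        if rng.random() < p_stay:
            current += 1
        else:
            dwells.append(current)
            current = 1
    # The final, still-running dwell is discarded because its length is censored.
    return dwells


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    p_stay = 0.9  # illustrative value, not taken from the paper
    dwells = simulate_dwell_times(p_stay, n_frames=200_000)
    print("simulated mean dwell:", mean(dwells))
    print("geometric mean dwell:", 1.0 / (1.0 - p_stay))
```

Any first-order Markov model of this kind carries the same one-parameter dwell-time shape, which is the bias the description attributes to HMM-based analysis of sparse, noisy traces.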
format Online
Article
Text
id pubmed-10541599
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10541599 2023-10-02 An accurate probabilistic step finder for time-series analysis Rojewski, Alex Schweiger, Maxwell Sgouralis, Ioannis Comstock, Matthew Pressé, Steve bioRxiv Article Cold Spring Harbor Laboratory 2023-09-22 /pmc/articles/PMC10541599/ /pubmed/37786687 http://dx.doi.org/10.1101/2023.09.19.558535 Text en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.
title An accurate probabilistic step finder for time-series analysis
topic Article