
Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions

BACKGROUND: Biological data suffers from noise that is inherent in the measurements. This is particularly true for time-series gene expression measurements. Nevertheless, to explore cellular dynamics, scientists employ such noisy measurements in predictive and clustering tools. However,...


Bibliographic Details
Main authors: Bar, Nadav, Nikparvar, Bahareh, Jayavelu, Naresh Doni, Roessler, Fabienne Krystin
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364503/
https://www.ncbi.nlm.nih.gov/pubmed/35945515
http://dx.doi.org/10.1186/s12859-022-04839-z
author Bar, Nadav
Nikparvar, Bahareh
Jayavelu, Naresh Doni
Roessler, Fabienne Krystin
collection PubMed
description BACKGROUND: Biological data suffers from noise that is inherent in the measurements. This is particularly true for time-series gene expression measurements. Nevertheless, to explore cellular dynamics, scientists employ such noisy measurements in predictive and clustering tools. However, noise can not only obscure the genes' temporal patterns; applying predictive and clustering tools to noisy data may also yield inconsistent, and potentially incorrect, results. RESULTS: To reduce the noise of short-term (< 48 h) time-series expression data, we relied on the three basic temporal patterns of gene expression: waves, impulses and sustained responses. We constrained the estimation of the true signals to these patterns by estimating the parameters of first- and second-order Fourier functions using the nonlinear least-squares trust-region optimization technique. Our approach lowered the noise in at least 85% of synthetic time-series expression data, significantly more than the spline method ([Formula: see text]). When the data contained a higher signal-to-noise ratio, our method allowed downstream network component analyses to calculate consistent and accurate predictions, particularly when the noise variance was high. Conversely, these tools led to erroneous results from untreated noisy data. Our results suggest that at least 5–7 time points are required to efficiently de-noise logarithmically scaled time-series expression data. Investing in sampling additional time points provides little benefit to clustering and prediction accuracy. CONCLUSIONS: Our constrained Fourier de-noising method helps to cluster noisy gene expression and interpret dynamic gene networks more accurately. The benefit of noise reduction is large and can constitute the difference between a successful application and a failing one.
format Online
Article
Text
id pubmed-9364503
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9364503 2022-08-11 Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions. Bar, Nadav; Nikparvar, Bahareh; Jayavelu, Naresh Doni; Roessler, Fabienne Krystin. BMC Bioinformatics. Research. BioMed Central 2022-08-09 /pmc/articles/PMC9364503/ /pubmed/35945515 http://dx.doi.org/10.1186/s12859-022-04839-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions
topic Research
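
The abstract describes de-noising by fitting first- and second-order Fourier functions with a nonlinear least-squares trust-region method. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the common second-order form a0 + a1 cos(wt) + b1 sin(wt) + a2 cos(2wt) + b2 sin(2wt), uses SciPy's `least_squares` with its Trust Region Reflective solver as a stand-in for the optimizer, and constrains only the fundamental frequency `w` through hypothetical bounds (`min_cycles`, `max_cycles`). The function names, time points, noise level and bounds are all illustrative assumptions; the record does not specify the constraints actually used.

```python
import numpy as np
from scipy.optimize import least_squares


def fourier2(params, t):
    """Second-order Fourier function with fundamental angular frequency w."""
    a0, a1, b1, a2, b2, w = params
    return (a0
            + a1 * np.cos(w * t) + b1 * np.sin(w * t)
            + a2 * np.cos(2 * w * t) + b2 * np.sin(2 * w * t))


def denoise(t, y, min_cycles=0.1, max_cycles=1.5):
    """Fit fourier2 to (t, y) by bounded nonlinear least squares.

    The bounds on w (expressed as cycles over the sampled time span) are
    hypothetical; they only illustrate how a constraint can keep the fit
    to slow, wave-like or impulse-like shapes.
    """
    span = t.max() - t.min()
    w_lo = 2 * np.pi * min_cycles / span
    w_hi = 2 * np.pi * max_cycles / span
    # Start from a flat line at the mean, with w strictly inside its bounds.
    x0 = np.array([y.mean(), 0.0, 0.0, 0.0, 0.0, 2 * np.pi * 0.5 / span])
    lower = [-np.inf] * 5 + [w_lo]
    upper = [np.inf] * 5 + [w_hi]
    result = least_squares(
        lambda p: fourier2(p, t) - y,  # residuals between model and data
        x0,
        bounds=(lower, upper),
        method="trf",  # Trust Region Reflective
    )
    return fourier2(result.x, t), result.x


# Synthetic example: nine hypothetical sampling times (hours) within 48 h.
rng = np.random.default_rng(0)
t = np.array([0.0, 2.0, 4.0, 8.0, 12.0, 16.0, 24.0, 36.0, 48.0])
truth = 1.0 + 0.8 * np.sin(2 * np.pi * t / 48.0)
noisy = truth + rng.normal(scale=0.3, size=t.size)
smooth, params = denoise(t, noisy)
```

Here `smooth` holds the fitted curve evaluated at the sampling times and `params` the six estimated coefficients. With this few time points the model has almost as many parameters as observations, so a first-order function (dropping `a2` and `b2`) may be the safer choice for very short series.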