Robust identification of temporal biomarkers in longitudinal omics studies

MOTIVATION: Longitudinal studies increasingly collect rich ‘omics’ data sampled frequently over time and across large cohorts to capture dynamic health fluctuations and disease transitions. However, the generation of longitudinal omics data has preceded the development of analysis tools that can eff...

Bibliographic Details
Main authors: Metwally, Ahmed A; Zhang, Tom; Wu, Si; Kellogg, Ryan; Zhou, Wenyu; Contrepois, Kevin; Tang, Hua; Snyder, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344853/
https://www.ncbi.nlm.nih.gov/pubmed/35762936
http://dx.doi.org/10.1093/bioinformatics/btac403
author Metwally, Ahmed A
Zhang, Tom
Wu, Si
Kellogg, Ryan
Zhou, Wenyu
Contrepois, Kevin
Tang, Hua
Snyder, Michael
author_sort Metwally, Ahmed A
collection PubMed
description MOTIVATION: Longitudinal studies increasingly collect rich ‘omics’ data sampled frequently over time and across large cohorts to capture dynamic health fluctuations and disease transitions. However, the generation of longitudinal omics data has preceded the development of analysis tools that can efficiently extract insights from such data. In particular, there is a need for statistical frameworks that can identify not only which omics features are differentially regulated between groups but also over what time intervals. Additionally, longitudinal omics data may have inconsistencies, including non-uniform sampling intervals, missing data points, subject dropout and differing numbers of samples per subject. RESULTS: In this work, we developed OmicsLonDA, a statistical method that provides robust identification of time intervals of temporal omics biomarkers. OmicsLonDA is based on a semi-parametric approach, in which we use smoothing splines to model longitudinal data and infer significant time intervals of omics features based on an empirical distribution constructed through a permutation procedure. We benchmarked OmicsLonDA on five simulated datasets with diverse temporal patterns, and the method showed specificity greater than 0.99 and sensitivity greater than 0.87. Applying OmicsLonDA to the iPOP cohort revealed temporal patterns of genes, proteins, metabolites and microbes that are differentially regulated in male versus female subjects following a respiratory infection. In addition, we applied OmicsLonDA to a longitudinal multi-omics dataset of pregnant women with and without preeclampsia, and OmicsLonDA identified potential lipid markers that are temporally significantly different between the two groups. AVAILABILITY AND IMPLEMENTATION: We provide an open-source R package (https://bioconductor.org/packages/OmicsLonDA), to enable widespread use. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
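The approach summarized in the abstract (smooth each group's trajectory, then judge each time interval against an empirical null built by permutation) can be illustrated with a small sketch. This is not the OmicsLonDA implementation or its API. To stay within the Python standard library, a Gaussian kernel smoother stands in for the smoothing splines, and the test statistic (absolute area between the two group curves), the subject-level label permutation, the function names and the toy data are all assumptions made for illustration.

```python
import math
import random


def kernel_smooth(times, values, grid, bandwidth):
    """Gaussian-kernel weighted average of `values`, evaluated at each grid time.

    Assumption: this simple smoother stands in for a smoothing spline.
    """
    fitted = []
    for g in grid:
        weights = [math.exp(-0.5 * ((t - g) / bandwidth) ** 2) for t in times]
        fitted.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return fitted


def group_curve(subjects, grid, bandwidth):
    """Pool every (time, value) sample in a group and smooth the pooled points."""
    times = [t for subject in subjects for t, _ in subject]
    values = [v for subject in subjects for _, v in subject]
    return kernel_smooth(times, values, grid, bandwidth)


def interval_stats(group_a, group_b, grid, bandwidth):
    """Absolute trapezoidal area between the two group curves on each interval."""
    curve_a = group_curve(group_a, grid, bandwidth)
    curve_b = group_curve(group_b, grid, bandwidth)
    diff = [a - b for a, b in zip(curve_a, curve_b)]
    return [abs(0.5 * (diff[i] + diff[i + 1]) * (grid[i + 1] - grid[i]))
            for i in range(len(grid) - 1)]


def interval_pvalues(group_a, group_b, grid, n_perm=200, bandwidth=1.0, seed=0):
    """Empirical p-value per interval from shuffling whole subjects between groups."""
    rng = random.Random(seed)
    observed = interval_stats(group_a, group_b, grid, bandwidth)
    pooled = list(group_a) + list(group_b)
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at the subject level
        perm = interval_stats(pooled[:len(group_a)], pooled[len(group_a):],
                              grid, bandwidth)
        for i, stat in enumerate(perm):
            if stat >= observed[i]:
                exceed[i] += 1
    # Add-one correction keeps every empirical p-value strictly positive.
    return [(count + 1) / (n_perm + 1) for count in exceed]


def make_subject(k, shift):
    """Hypothetical toy subject: skips time point k; `shift` applies from t = 5 on."""
    return [(float(t), 0.1 * k + (shift if t >= 5 else 0.0))
            for t in range(11) if t != k]


group_a = [make_subject(k, 0.0) for k in range(6)]
group_b = [make_subject(k, 3.0) for k in range(6)]
grid = [float(t) for t in range(11)]

pvals = interval_pvalues(group_a, group_b, grid)
for i, p in enumerate(pvals):
    print(f"[{grid[i]:.0f}, {grid[i + 1]:.0f}]  p = {p:.3f}")
```

In this toy setup the two groups differ only from t = 5 onward, so small p-values are expected for the late intervals and not for the early ones. The sketch makes no correction for testing many intervals and features at once, which any real analysis would need to address.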
format Online
Article
Text
id pubmed-9344853
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9344853 2022-08-03 Robust identification of temporal biomarkers in longitudinal omics studies Metwally, Ahmed A Zhang, Tom Wu, Si Kellogg, Ryan Zhou, Wenyu Contrepois, Kevin Tang, Hua Snyder, Michael Bioinformatics Original Papers Oxford University Press 2022-06-28 /pmc/articles/PMC9344853/ /pubmed/35762936 http://dx.doi.org/10.1093/bioinformatics/btac403 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Robust identification of temporal biomarkers in longitudinal omics studies
topic Original Papers