
ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system

Numerous biological systems oscillate over time or space. Despite these oscillators’ importance, data from an oscillatory system is problematic for existing methods of regularized supervised learning. We present ZeitZeiger, a method to predict a periodic variable (e.g. time of day) from a high-dimen...

Bibliographic Details
Main authors: Hughey, Jacob J., Hastie, Trevor, Butte, Atul J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4856978/
https://www.ncbi.nlm.nih.gov/pubmed/26819407
http://dx.doi.org/10.1093/nar/gkw030
author Hughey, Jacob J.
Hastie, Trevor
Butte, Atul J.
collection PubMed
description Numerous biological systems oscillate over time or space. Despite these oscillators’ importance, data from an oscillatory system is problematic for existing methods of regularized supervised learning. We present ZeitZeiger, a method to predict a periodic variable (e.g. time of day) from a high-dimensional observation. ZeitZeiger learns a sparse representation of the variation associated with the periodic variable in the training observations, then uses maximum-likelihood to make a prediction for a test observation. We applied ZeitZeiger to a comprehensive dataset of genome-wide gene expression from the mammalian circadian oscillator. Using the expression of 13 genes, ZeitZeiger predicted circadian time (internal time of day) in each of 12 mouse organs to within ∼1 h, resulting in a multi-organ predictor of circadian time. Compared to the state-of-the-art approach, ZeitZeiger was faster, more accurate and used fewer genes. We then validated the multi-organ predictor on 20 additional datasets comprising nearly 800 samples. Our results suggest that ZeitZeiger not only makes accurate predictions, but also gives insight into the behavior and structure of the oscillator from which the data originated. As our ability to collect high-dimensional data from various biological oscillators increases, ZeitZeiger should enhance efforts to convert these data to knowledge.
format Online
Article
Text
id pubmed-4856978
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4856978 2016-05-09 Nucleic Acids Res Methods Online Oxford University Press 2016-05-05 2016-01-26 /pmc/articles/PMC4856978/ /pubmed/26819407 http://dx.doi.org/10.1093/nar/gkw030 Text en
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ZeitZeiger: supervised learning for high-dimensional data from an oscillatory system
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4856978/
https://www.ncbi.nlm.nih.gov/pubmed/26819407
http://dx.doi.org/10.1093/nar/gkw030
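
The abstract describes predicting a periodic variable by maximum likelihood from a model learned on training observations. The following is a minimal Python sketch of that general idea only, not the authors' implementation: it omits the sparse-representation step entirely, and the Fourier-basis mean curves, the independent Gaussian noise model, the grid search, the rescaling of time to [0, 1), and all function names are assumptions made for illustration.

```python
import numpy as np


def fourier_basis(t, n_harmonics=2):
    """Design matrix of periodic basis functions for times t scaled to [0, 1)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(2 * np.pi * k * t))
        cols.append(np.sin(2 * np.pi * k * t))
    return np.column_stack(cols)


def fit_periodic_model(X, t, n_harmonics=2):
    """Fit a periodic mean curve and a residual variance for each feature.

    X has shape (n_samples, n_features); t has shape (n_samples,).
    """
    B = fourier_basis(t, n_harmonics)
    # Least-squares fit of every feature (column of X) at once.
    coef, *_ = np.linalg.lstsq(B, X, rcond=None)
    resid = X - B @ coef
    # Small constant keeps the variance strictly positive.
    var = resid.var(axis=0) + 1e-8
    return coef, var


def predict_time(x, coef, var, n_harmonics=2, n_grid=240):
    """Return the grid time that maximizes the Gaussian log-likelihood of x."""
    grid = np.arange(n_grid) / n_grid
    mean = fourier_basis(grid, n_harmonics) @ coef  # (n_grid, n_features)
    loglik = -0.5 * ((x - mean) ** 2 / var + np.log(2 * np.pi * var)).sum(axis=1)
    return grid[np.argmax(loglik)]


def circular_error(t_pred, t_true):
    """Absolute error on the unit circle, so 0.95 and 0.05 are 0.1 apart."""
    d = abs(t_pred - t_true) % 1.0
    return min(d, 1.0 - d)
```

To use the sketch, call `fit_periodic_model` on a training matrix and its known times, then `predict_time` on one test observation. Errors should be measured with `circular_error`, because a periodic variable wraps around: a prediction just before the end of the period is close to a true value just after its start.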