Simplivariate Models: Ideas and First Examples

Bibliographic Details
Main authors: Hageman, Jos A., Hendriks, Margriet M. W. B., Westerhuis, Johan A., van der Werf, Mariët J., Berger, Ruud, Smilde, Age K.
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533398/
https://www.ncbi.nlm.nih.gov/pubmed/18810272
http://dx.doi.org/10.1371/journal.pone.0003259
author Hageman, Jos A.
Hendriks, Margriet M. W. B.
Westerhuis, Johan A.
van der Werf, Mariët J.
Berger, Ruud
Smilde, Age K.
collection PubMed
description One of the new expanding areas in functional genomics is metabolomics: measuring the metabolome of an organism. Data generated in metabolomics studies are very diverse in nature, depending on the design underlying the experiment. Traditionally, variation in measurements is conceptually broken down into systematic variation and noise, where the latter contains, e.g., technical variation. There is increasing evidence that this distinction does not hold (or is too simple) for metabolomics data. A more useful distinction is in terms of informative and non-informative variation, where informative relates to the problem being studied. In most common methods for analyzing metabolomics (or any other high-dimensional x-omics) data, this distinction is ignored, thereby severely hampering the results of the analysis. This leads to poorly interpretable models and may even obscure the relevant biological information. We developed a framework from first data analysis principles by explicitly formulating the problem of analyzing metabolomics data in terms of informative and non-informative parts. This framework allows for flexible interactions with the biologists involved in formulating prior knowledge of underlying structures. The basic idea is that the informative parts of the complex metabolomics data are approximated by simple components with a biological meaning, e.g., in terms of metabolic pathways or their regulation. Hence, we termed the framework ‘simplivariate models’, which constitutes a new way of looking at metabolomics data. The framework is given in its full generality and exemplified with two methods, IDR analysis and plaid modeling, that fit into the framework. Using this strategy of ‘divide and conquer’, we show that meaningful simplivariate models can be obtained using a real-life microbial metabolomics data set. For instance, one of the simple components contained all the measured intermediates of the Krebs cycle of E. coli. Moreover, these simplivariate models were able to uncover regulatory mechanisms present in the phenylalanine biosynthesis route of E. coli.
format Text
id pubmed-2533398
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2533398 2008-09-23 Simplivariate Models: Ideas and First Examples Hageman, Jos A. Hendriks, Margriet M. W. B. Westerhuis, Johan A. van der Werf, Mariët J. Berger, Ruud Smilde, Age K. PLoS One Research Article Public Library of Science 2008-09-23 /pmc/articles/PMC2533398/ /pubmed/18810272 http://dx.doi.org/10.1371/journal.pone.0003259 Text en Hageman et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Simplivariate Models: Ideas and First Examples
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533398/
https://www.ncbi.nlm.nih.gov/pubmed/18810272
http://dx.doi.org/10.1371/journal.pone.0003259