MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics

Bibliographic Details
Main authors: Liu, Youzhong; Smirnov, Kirill; Lucio, Marianna; Gougeon, Régis D.; Alexandre, Hervé; Schmitt-Kopplin, Philippe
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4776428/
https://www.ncbi.nlm.nih.gov/pubmed/26936354
http://dx.doi.org/10.1186/s12859-016-0970-4
author Liu, Youzhong
Smirnov, Kirill
Lucio, Marianna
Gougeon, Régis D.
Alexandre, Hervé
Schmitt-Kopplin, Philippe
collection PubMed
description BACKGROUND: Interpreting non-targeted metabolomics data remains a challenging task. Signals from non-targeted metabolomics studies stem from a combination of biological causes, complex interactions between them, and experimental bias/noise. The resulting data matrix usually contains a huge number of variables and only a few samples, and classical techniques using nonlinear mapping could result in computational complexity and overfitting. Independent Component Analysis (ICA), as a linear method, could potentially bring more meaningful results than Principal Component Analysis (PCA). However, a major problem with most ICA algorithms is that their output varies between runs, so the result of a single ICA run should be interpreted with caution. RESULTS: ICA was applied to simulated and experimental mass spectrometry (MS)-based non-targeted metabolomics data, under the hypothesis that the underlying sources are mutually independent. Inspired by the Icasso algorithm, a new ICA method, MetICA, was developed to handle the instability of ICA on complex datasets. Like the original Icasso algorithm, MetICA evaluates the algorithmic and statistical reliability of ICA runs. In addition, MetICA suggests two ways to select the optimal number of model components and gives an order of interpretation for the components obtained. CONCLUSIONS: Correlating the components obtained with prior biological knowledge helps to understand how non-targeted metabolomics data reflect biological nature and technical phenomena. We could also extract mass signals related to this information. This novel approach provides meaningful components due to their independent nature. Furthermore, it provides an innovative concept on which to base model selection: optimizing the number of reliable components instead of trying to fit the data. The current version of MetICA is available at https://github.com/daniellyz/MetICA.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0970-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4776428
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4776428 2016-03-04 MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics Liu, Youzhong Smirnov, Kirill Lucio, Marianna Gougeon, Régis D. Alexandre, Hervé Schmitt-Kopplin, Philippe BMC Bioinformatics Methodology Article BACKGROUND: Interpreting non-targeted metabolomics data remains a challenging task. Signals from non-targeted metabolomics studies stem from a combination of biological causes, complex interactions between them, and experimental bias/noise. The resulting data matrix usually contains a huge number of variables and only a few samples, and classical techniques using nonlinear mapping could result in computational complexity and overfitting. Independent Component Analysis (ICA), as a linear method, could potentially bring more meaningful results than Principal Component Analysis (PCA). However, a major problem with most ICA algorithms is that their output varies between runs, so the result of a single ICA run should be interpreted with caution. RESULTS: ICA was applied to simulated and experimental mass spectrometry (MS)-based non-targeted metabolomics data, under the hypothesis that the underlying sources are mutually independent. Inspired by the Icasso algorithm, a new ICA method, MetICA, was developed to handle the instability of ICA on complex datasets. Like the original Icasso algorithm, MetICA evaluates the algorithmic and statistical reliability of ICA runs. In addition, MetICA suggests two ways to select the optimal number of model components and gives an order of interpretation for the components obtained. CONCLUSIONS: Correlating the components obtained with prior biological knowledge helps to understand how non-targeted metabolomics data reflect biological nature and technical phenomena. We could also extract mass signals related to this information. This novel approach provides meaningful components due to their independent nature. Furthermore, it provides an innovative concept on which to base model selection: optimizing the number of reliable components instead of trying to fit the data. The current version of MetICA is available at https://github.com/daniellyz/MetICA. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0970-4) contains supplementary material, which is available to authorized users. BioMed Central 2016-03-02 /pmc/articles/PMC4776428/ /pubmed/26936354 http://dx.doi.org/10.1186/s12859-016-0970-4 Text en © Liu et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4776428/
https://www.ncbi.nlm.nih.gov/pubmed/26936354
http://dx.doi.org/10.1186/s12859-016-0970-4