DISCO-SCA and Properly Applied GSVD as Swinging Methods to Find Common and Distinctive Processes

BACKGROUND: In systems biology it is common to obtain for the same set of biological entities information from multiple sources. Examples include expression data for the same set of orthologous genes screened in different organisms and data on the same set of culture samples obtained with different...

Bibliographic Details
Main authors: Van Deun, Katrijn, Van Mechelen, Iven, Thorrez, Lieven, Schouteden, Martijn, De Moor, Bart, van der Werf, Mariët J., De Lathauwer, Lieven, Smilde, Age K., Kiers, Henk A. L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3365060/
https://www.ncbi.nlm.nih.gov/pubmed/22693578
http://dx.doi.org/10.1371/journal.pone.0037840
_version_ 1782234636280135680
author Van Deun, Katrijn
Van Mechelen, Iven
Thorrez, Lieven
Schouteden, Martijn
De Moor, Bart
van der Werf, Mariët J.
De Lathauwer, Lieven
Smilde, Age K.
Kiers, Henk A. L.
author_sort Van Deun, Katrijn
collection PubMed
description BACKGROUND: In systems biology it is common to obtain for the same set of biological entities information from multiple sources. Examples include expression data for the same set of orthologous genes screened in different organisms and data on the same set of culture samples obtained with different high-throughput techniques. A major challenge is to find the important biological processes underlying the data and to disentangle therein processes common to all data sources and processes distinctive for a specific source. Recently, two promising simultaneous data integration methods have been proposed to attain this goal, namely generalized singular value decomposition (GSVD) and simultaneous component analysis with rotation to common and distinctive components (DISCO-SCA).
RESULTS: Both theoretical analyses and applications to biologically relevant data show that: (1) straightforward applications of GSVD yield unsatisfactory results, (2) DISCO-SCA performs well, (3) provided proper pre-processing and algorithmic adaptations, GSVD reaches a performance level similar to that of DISCO-SCA, and (4) DISCO-SCA is directly generalizable to more than two data sources. The biological relevance of DISCO-SCA is illustrated with two applications. First, in a setting of comparative genomics, it is shown that DISCO-SCA recovers a common theme of cell cycle progression and a yeast-specific response to pheromones. The biological annotation was obtained by applying Gene Set Enrichment Analysis in an appropriate way. Second, in an application of DISCO-SCA to metabolomics data for Escherichia coli obtained with two different chemical analysis platforms, it is illustrated that the metabolites involved in some of the biological processes underlying the data are detected by one of the two platforms only; therefore, platforms for microbial metabolomics should be tailored to the biological question.
CONCLUSIONS: Both DISCO-SCA and properly applied GSVD are promising integrative methods for finding common and distinctive processes in multisource data. Open source code for both methods is provided.
format Online
Article
Text
id pubmed-3365060
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3365060 2012-06-12 PLoS One Research Article Public Library of Science 2012-05-31 /pmc/articles/PMC3365060/ /pubmed/22693578 http://dx.doi.org/10.1371/journal.pone.0037840 Text en Van Deun et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title DISCO-SCA and Properly Applied GSVD as Swinging Methods to Find Common and Distinctive Processes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3365060/
https://www.ncbi.nlm.nih.gov/pubmed/22693578
http://dx.doi.org/10.1371/journal.pone.0037840