Combining transcriptional datasets using the generalized singular value decomposition

BACKGROUND: Both microarrays and quantitative real-time PCR are convenient tools for studying the transcriptional levels of genes. The former is preferable for large scale studies while the latter is a more targeted technique. Because of platform-dependent systematic effects, simple comparisons or m...

Bibliographic Details
Main Authors: Schreiber, Andreas W, Shirley, Neil J, Burton, Rachel A, Fincher, Geoffrey B
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2562393/
https://www.ncbi.nlm.nih.gov/pubmed/18687147
http://dx.doi.org/10.1186/1471-2105-9-335
author Schreiber, Andreas W
Shirley, Neil J
Burton, Rachel A
Fincher, Geoffrey B
collection PubMed
description BACKGROUND: Both microarrays and quantitative real-time PCR are convenient tools for studying the transcriptional levels of genes. The former is preferable for large-scale studies while the latter is a more targeted technique. Because of platform-dependent systematic effects, simple comparisons or merging of datasets obtained by these technologies are difficult, even though they may often be desirable. These difficulties are exacerbated if there is only partial overlap between the experimental conditions and genes probed in the two datasets. RESULTS: We show here that the generalized singular value decomposition provides a practical tool for merging a small, targeted dataset obtained by quantitative real-time PCR of specific genes with a much larger microarray dataset. The technique permits, for the first time, the identification of genes present in only one dataset co-expressed with a target gene present exclusively in the other dataset, even when experimental conditions for the two datasets are not identical. With the rapidly increasing number of publicly available large-scale microarray datasets, the latter is frequently the case. The method enables us to discover putative candidate genes involved in the biosynthesis of the (1,3;1,4)-β-D-glucan polysaccharide found in plant cell walls. CONCLUSION: We show that the generalized singular value decomposition provides a viable tool for a combined analysis of two gene expression datasets with only partial overlap of both gene sets and experimental conditions. We illustrate how the decomposition can be optimized self-consistently by using a judicious choice of genes to define it. The ability of the technique to seamlessly define a concept of "co-expression" across both datasets provides an avenue for meaningful data integration. We believe that it will prove to be particularly useful for exploiting large, publicly available microarray datasets for species with unsequenced genomes by complementing them with more limited in-house expression measurements.
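For readers unfamiliar with the decomposition named in the abstract, the following is a minimal sketch of a generalized singular value decomposition of two matrices that share a column dimension. It is not the authors' procedure: the function name gsvd, the matrix orientation, the synthetic data, and the QR-plus-SVD construction are all assumptions made for illustration, and the sketch only covers the simple case where each matrix has at least as many rows as columns, the stacked matrix has full column rank, and no generalized singular value is zero.

```python
import numpy as np


def gsvd(A, B):
    """Illustrative GSVD of A (m1 x n) and B (m2 x n), with m1 >= n and m2 >= n.

    Returns U, V, c, s, X such that
        A = U @ diag(c) @ X   and   B = V @ diag(s) @ X,
    where U and V have orthonormal columns and c**2 + s**2 == 1.
    Assumes the stacked matrix [A; B] has full column rank and s has no zeros.
    """
    m1 = A.shape[0]
    # Reduced QR of the stacked matrix: Q is (m1 + m2) x n, R is n x n.
    Q, R = np.linalg.qr(np.vstack([A, B]))
    Q1, Q2 = Q[:m1], Q[m1:]
    # SVD of the top block gives U, the "cosines" c, and the rotation W.
    U, c, Wt = np.linalg.svd(Q1, full_matrices=False)
    # Q2 @ W has mutually orthogonal columns whose norms are the "sines" s.
    Z = Q2 @ Wt.T
    s = np.linalg.norm(Z, axis=0)
    V = Z / s
    # Right factor shared by both datasets.
    X = Wt @ R
    return U, V, c, s, X


# Synthetic stand-ins (not real expression data): a small targeted dataset
# and a much larger one, both measured over the same 3 hypothetical conditions.
rng = np.random.default_rng(0)
small = rng.normal(size=(6, 3))
large = rng.normal(size=(50, 3))
U, V, c, s, X = gsvd(small, large)
print(np.allclose(U @ np.diag(c) @ X, small), np.allclose(V @ np.diag(s) @ X, large))
```

Because both matrices are expressed through the same right factor X, rows of the two datasets can be compared in a common basis, which is the property the abstract relies on when it speaks of co-expression across datasets.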
format Text
id pubmed-2562393
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2562393 2008-10-07 Combining transcriptional datasets using the generalized singular value decomposition Schreiber, Andreas W Shirley, Neil J Burton, Rachel A Fincher, Geoffrey B BMC Bioinformatics Methodology Article BioMed Central 2008-08-08 /pmc/articles/PMC2562393/ /pubmed/18687147 http://dx.doi.org/10.1186/1471-2105-9-335 Text en Copyright © 2008 Schreiber et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Combining transcriptional datasets using the generalized singular value decomposition
topic Methodology Article