Multivariate curve resolution of time course microarray data

BACKGROUND: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biologica...

Bibliographic Details
Main authors: Wentzell, Peter D, Karakach, Tobias K, Roy, Sushmita, Martinez, M Juanita, Allen, Christopher P, Werner-Washburne, Margaret
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1539028/
https://www.ncbi.nlm.nih.gov/pubmed/16839419
http://dx.doi.org/10.1186/1471-2105-7-343
_version_ 1782129162204479488
author Wentzell, Peter D
Karakach, Tobias K
Roy, Sushmita
Martinez, M Juanita
Allen, Christopher P
Werner-Washburne, Margaret
author_facet Wentzell, Peter D
Karakach, Tobias K
Roy, Sushmita
Martinez, M Juanita
Allen, Christopher P
Werner-Washburne, Margaret
author_sort Wentzell, Peter D
collection PubMed
description BACKGROUND: Modeling of gene expression data from time course experiments often involves the use of linear models such as those obtained from principal component analysis (PCA), independent component analysis (ICA), or other methods. Such methods do not generally yield factors with a clear biological interpretation. Moreover, implicit assumptions about the measurement errors often limit the application of these methods to log-transformed data, destroying linear structure in the untransformed expression data. RESULTS: In this work, a method for the linear decomposition of gene expression data by multivariate curve resolution (MCR) is introduced. The MCR method is based on an alternating least-squares (ALS) algorithm implemented with a weighted least squares approach. The new method, MCR-WALS, extracts a small number of basis functions from untransformed microarray data using only non-negativity constraints. Measurement error information can be incorporated into the modeling process and missing data can be imputed. The utility of the method is demonstrated through its application to yeast cell cycle data. CONCLUSION: Profiles extracted by MCR-WALS exhibit a strong correlation with cell cycle-associated genes, but also suggest new insights into the regulation of those genes. The unique features of the MCR-WALS algorithm are its freedom from assumptions about the underlying linear model other than the non-negativity of gene expression, its ability to analyze non-log-transformed data, and its use of measurement error information to obtain a weighted model and accommodate missing measurements.
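The description above outlines a weighted alternating least-squares fit under non-negativity constraints, with missing values imputed from the model. The following is a minimal sketch of that general idea, not the authors' MCR-WALS implementation. It assumes uncorrelated measurement errors with a known standard deviation per entry (`sigma`), marks missing measurements as `NaN`, and gives them zero weight. The function names `weighted_nnls_rows` and `mcr_wals_sketch`, the random initialization, and the fixed iteration count are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def weighted_nnls_rows(D, W, B):
    """For each row i of D, solve
    min over x >= 0 of sum_j W[i, j] * (D[i, j] - x @ B[j]) ** 2."""
    X = np.zeros((D.shape[0], B.shape[1]))
    for i in range(D.shape[0]):
        sw = np.sqrt(W[i])  # scaling by sqrt(weight) turns WLS into ordinary NNLS
        X[i], _ = nnls(B * sw[:, None], D[i] * sw)
    return X


def mcr_wals_sketch(D, sigma, n_components, n_iter=50, seed=0):
    """Fit D (genes x time points) ~ C @ S.T with C >= 0 and S >= 0.

    Assumption: errors are independent with standard deviation sigma[i, j].
    Missing entries (NaN in D) receive zero weight and are imputed afterwards.
    """
    D = np.asarray(D, dtype=float)
    missing = np.isnan(D)
    W = np.where(missing, 0.0, 1.0 / np.asarray(sigma, dtype=float) ** 2)
    D0 = np.where(missing, 0.0, D)  # placeholder values; their weight is zero

    rng = np.random.default_rng(seed)
    S = rng.random((D.shape[1], n_components))  # random non-negative start
    losses = []
    for _ in range(n_iter):
        C = weighted_nnls_rows(D0, W, S)      # update contributions, profiles fixed
        S = weighted_nnls_rows(D0.T, W.T, C)  # update profiles, contributions fixed
        losses.append(float(np.sum(W * (D0 - C @ S.T) ** 2)))

    D_imputed = np.where(missing, C @ S.T, D)  # fill gaps from the fitted model
    return C, S, D_imputed, losses
```

Calling `mcr_wals_sketch(D, sigma, n_components=3)` on an untransformed expression matrix would return non-negative factors `C` and `S`, a copy of `D` with missing entries filled in, and the weighted residual sum of squares after each iteration. Row-wise non-negative least squares is used here because each alternating step is then an exact constrained minimization; a treatment of correlated errors would need a more general weighting scheme than this sketch provides.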
format Text
id pubmed-1539028
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1539028 2006-08-14 Multivariate curve resolution of time course microarray data Wentzell, Peter D Karakach, Tobias K Roy, Sushmita Martinez, M Juanita Allen, Christopher P Werner-Washburne, Margaret BMC Bioinformatics Methodology Article
BioMed Central 2006-07-13 /pmc/articles/PMC1539028/ /pubmed/16839419 http://dx.doi.org/10.1186/1471-2105-7-343 Text en Copyright © 2006 Wentzell et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Wentzell, Peter D
Karakach, Tobias K
Roy, Sushmita
Martinez, M Juanita
Allen, Christopher P
Werner-Washburne, Margaret
Multivariate curve resolution of time course microarray data
title Multivariate curve resolution of time course microarray data
title_sort multivariate curve resolution of time course microarray data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1539028/
https://www.ncbi.nlm.nih.gov/pubmed/16839419
http://dx.doi.org/10.1186/1471-2105-7-343