
Spectral embedding finds meaningful (relevant) structure in image and microarray data

BACKGROUND: Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR)...


Bibliographic Details
Main Authors:	Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1395341/ https://www.ncbi.nlm.nih.gov/pubmed/16483359 http://dx.doi.org/10.1186/1471-2105-7-74

_version_	1782126951238991872
author	Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L
author_facet	Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L
author_sort	Higgs, Brandon W
collection	PubMed
description	BACKGROUND: Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR) method that is unsupervised in that it relies only on the data; projections are calculated in Euclidean or a similar linear space and do not use tuning parameters for optimizing the fit to the data. However, relationships within sets of nonlinear data types, such as biological networks or images, are frequently mis-rendered into a low dimensional space by linear methods. Nonlinear methods, in contrast, attempt to model important aspects of the underlying data structure, often requiring parameter(s) fitting to the data type of interest. In many cases, the optimal parameter values vary when different classification algorithms are applied on the same rendered subspace, making the results of such methods highly dependent upon the type of classifier implemented. RESULTS: We present the results of applying the spectral method of Lafon, a nonlinear DR method based on the weighted graph Laplacian, that minimizes the requirements for such parameter optimization for two biological data types. We demonstrate that it is successful in determining implicit ordering of brain slice image data and in classifying separate species in microarray data, as compared to two conventional linear methods and three nonlinear methods (one of which is an alternative spectral method). This spectral implementation is shown to provide more meaningful information, by preserving important relationships, than the methods of DR presented for comparison. Tuning parameter fitting is simple and is a general, rather than data type or experiment specific approach, for the two datasets analyzed here. 
Tuning parameter optimization is minimized in the DR step to each subsequent classification method, enabling the possibility of valid cross-experiment comparisons. CONCLUSION: Results from the spectral method presented here exhibit the desirable properties of preserving meaningful nonlinear relationships in lower dimensional space and requiring minimal parameter fitting, providing a useful algorithm for purposes of visualization and classification across diverse datasets, a common challenge in systems biology.
format	Text
id	pubmed-1395341
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1395341 2006-04-21 Spectral embedding finds meaningful (relevant) structure in image and microarray data Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L BMC Bioinformatics Research Article BioMed Central 2006-02-16 /pmc/articles/PMC1395341/ /pubmed/16483359 http://dx.doi.org/10.1186/1471-2105-7-74 Text en Copyright © 2006 Higgs et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Higgs, Brandon W Weller, Jennifer Solka, Jeffrey L Spectral embedding finds meaningful (relevant) structure in image and microarray data
title	Spectral embedding finds meaningful (relevant) structure in image and microarray data
title_full	Spectral embedding finds meaningful (relevant) structure in image and microarray data
title_fullStr	Spectral embedding finds meaningful (relevant) structure in image and microarray data
title_full_unstemmed	Spectral embedding finds meaningful (relevant) structure in image and microarray data
title_short	Spectral embedding finds meaningful (relevant) structure in image and microarray data
title_sort	spectral embedding finds meaningful (relevant) structure in image and microarray data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1395341/ https://www.ncbi.nlm.nih.gov/pubmed/16483359 http://dx.doi.org/10.1186/1471-2105-7-74
work_keys_str_mv	AT higgsbrandonw spectralembeddingfindsmeaningfulrelevantstructureinimageandmicroarraydata AT wellerjennifer spectralembeddingfindsmeaningfulrelevantstructureinimageandmicroarraydata AT solkajeffreyl spectralembeddingfindsmeaningfulrelevantstructureinimageandmicroarraydata
