Sparse representation approaches for the classification of high-dimensional biological data

BACKGROUND: High-throughput genomic and proteomic data have important applications in medicine, including prevention, diagnosis, treatment, and prognosis of diseases, and in molecular biology, for example pathway identification. Many such applications can be formulated as classification and dimension...

Bibliographic Details
Main Authors:	Li, Yifeng; Ngom, Alioune
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3854665/ https://www.ncbi.nlm.nih.gov/pubmed/24565287 http://dx.doi.org/10.1186/1752-0509-7-S4-S6

_version_	1782294843060387840
author	Li, Yifeng Ngom, Alioune
author_facet	Li, Yifeng Ngom, Alioune
author_sort	Li, Yifeng
collection	PubMed
description	BACKGROUND: High-throughput genomic and proteomic data have important applications in medicine, including prevention, diagnosis, treatment, and prognosis of diseases, and in molecular biology, for example pathway identification. Many such applications can be formulated as classification and dimension reduction problems in machine learning. Accurately classifying such data is computationally challenging due to dimensionality, noise, and redundancy, to name a few issues. The principle of sparse representation has been applied to analyzing high-dimensional biological data within the frameworks of clustering, classification, and dimension reduction approaches. However, the existing sparse representation methods are inefficient. The kernel extensions have not been well addressed either. Moreover, sparse representation techniques have not yet been comprehensively studied in bioinformatics. RESULTS: In this paper, a Bayesian treatment of sparse representations is presented. Various sparse coding and dictionary learning models are discussed. We propose a fast parallel active-set optimization algorithm for each model. Kernel versions are devised based on their dimension-free property. These models are applied to classifying high-dimensional biological data. CONCLUSIONS: In our experiment, we compared our models with other methods on both accuracy and computing time. The results show that our models achieve satisfactory accuracy and are very efficient.
format	Online Article Text
id	pubmed-3854665
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3854665 2013-12-16 Sparse representation approaches for the classification of high-dimensional biological data Li, Yifeng Ngom, Alioune BMC Syst Biol Research BACKGROUND: High-throughput genomic and proteomic data have important applications in medicine, including prevention, diagnosis, treatment, and prognosis of diseases, and in molecular biology, for example pathway identification. Many such applications can be formulated as classification and dimension reduction problems in machine learning. Accurately classifying such data is computationally challenging due to dimensionality, noise, and redundancy, to name a few issues. The principle of sparse representation has been applied to analyzing high-dimensional biological data within the frameworks of clustering, classification, and dimension reduction approaches. However, the existing sparse representation methods are inefficient. The kernel extensions have not been well addressed either. Moreover, sparse representation techniques have not yet been comprehensively studied in bioinformatics. RESULTS: In this paper, a Bayesian treatment of sparse representations is presented. Various sparse coding and dictionary learning models are discussed. We propose a fast parallel active-set optimization algorithm for each model. Kernel versions are devised based on their dimension-free property. These models are applied to classifying high-dimensional biological data. CONCLUSIONS: In our experiment, we compared our models with other methods on both accuracy and computing time. The results show that our models achieve satisfactory accuracy and are very efficient. BioMed Central 2013-10-23 /pmc/articles/PMC3854665/ /pubmed/24565287 http://dx.doi.org/10.1186/1752-0509-7-S4-S6 Text en Copyright © 2013 Li and Ngom; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Li, Yifeng Ngom, Alioune Sparse representation approaches for the classification of high-dimensional biological data
title	Sparse representation approaches for the classification of high-dimensional biological data
title_full	Sparse representation approaches for the classification of high-dimensional biological data
title_fullStr	Sparse representation approaches for the classification of high-dimensional biological data
title_full_unstemmed	Sparse representation approaches for the classification of high-dimensional biological data
title_short	Sparse representation approaches for the classification of high-dimensional biological data
title_sort	sparse representation approaches for the classification of high-dimensional biological data
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3854665/ https://www.ncbi.nlm.nih.gov/pubmed/24565287 http://dx.doi.org/10.1186/1752-0509-7-S4-S6
work_keys_str_mv	AT liyifeng sparserepresentationapproachesfortheclassificationofhighdimensionalbiologicaldata AT ngomalioune sparserepresentationapproachesfortheclassificationofhighdimensionalbiologicaldata
