Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data
BACKGROUND: Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of...
Main authors: | Viswanath, Satish; Madabhushi, Anant |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395843/ https://www.ncbi.nlm.nih.gov/pubmed/22316103 http://dx.doi.org/10.1186/1471-2105-13-26 |
_version_ | 1782238044133261312 |
---|---|
author | Viswanath, Satish Madabhushi, Anant |
author_facet | Viswanath, Satish Madabhushi, Anant |
author_sort | Viswanath, Satish |
collection | PubMed |
description | BACKGROUND: Dimensionality reduction (DR) enables the construction of a lower dimensional space (embedding) from a higher dimensional feature space while preserving object-class discriminability. However several popular DR approaches suffer from sensitivity to choice of parameters and/or presence of noise in the data. In this paper, we present a novel DR technique known as consensus embedding that aims to overcome these problems by generating and combining multiple low-dimensional embeddings, hence exploiting the variance among them in a manner similar to ensemble classifier schemes such as Bagging. We demonstrate theoretical properties of consensus embedding which show that it will result in a single stable embedding solution that preserves information more accurately as compared to any individual embedding (generated via DR schemes such as Principal Component Analysis, Graph Embedding, or Locally Linear Embedding). Intelligent sub-sampling (via mean-shift) and code parallelization are utilized to provide for an efficient implementation of the scheme. RESULTS: Applications of consensus embedding are shown in the context of classification and clustering as applied to: (1) image partitioning of white matter and gray matter on 10 different synthetic brain MRI images corrupted with 18 different combinations of noise and bias field inhomogeneity, (2) classification of 4 high-dimensional gene-expression datasets, (3) cancer detection (at a pixel-level) on 16 image slices obtained from 2 different high-resolution prostate MRI datasets. In over 200 different experiments concerning classification and segmentation of biomedical data, consensus embedding was found to consistently outperform both linear and non-linear DR methods within all applications considered. CONCLUSIONS: We have presented a novel framework termed consensus embedding which leverages ensemble classification theory within dimensionality reduction, allowing for application to a wide range of high-dimensional biomedical data classification and segmentation problems. Our generalizable framework allows for improved representation and classification in the context of both imaging and non-imaging data. The algorithm offers a promising solution to problems that currently plague DR methods, and may allow for extension to other areas of biomedical data analysis. |
format | Online Article Text |
id | pubmed-3395843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-33958432012-07-16 Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data Viswanath, Satish Madabhushi, Anant BMC Bioinformatics Methodology Article BioMed Central 2012-02-08 /pmc/articles/PMC3395843/ /pubmed/22316103 http://dx.doi.org/10.1186/1471-2105-13-26 Text en Copyright ©2012 Viswanath and Madabhushi; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Viswanath, Satish Madabhushi, Anant Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data |
title | Consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data |
title_sort | consensus embedding: theory, algorithms and application to segmentation and classification of biomedical data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3395843/ https://www.ncbi.nlm.nih.gov/pubmed/22316103 http://dx.doi.org/10.1186/1471-2105-13-26 |
work_keys_str_mv | AT viswanathsatish consensusembeddingtheoryalgorithmsandapplicationtosegmentationandclassificationofbiomedicaldata AT madabhushianant consensusembeddingtheoryalgorithmsandapplicationtosegmentationandclassificationofbiomedicaldata |
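
The description above summarizes consensus embedding as generating several low-dimensional embeddings and combining them into one stable embedding. The following is a minimal sketch of that general idea, not the authors' implementation. Every specific choice in it is an assumption made for illustration: base embeddings come from PCA on random feature subsets, they are combined by taking the median of their normalized pairwise distances, and the combined distances are turned back into coordinates with classical multidimensional scaling. The function names and parameters (`consensus_embedding_sketch`, `subset_frac`, and so on) are hypothetical. The sub-sampling via mean-shift and the parallelization mentioned in the description are not shown.

```python
import numpy as np


def pca_embed(X, n_dims):
    """Project the rows of X onto their top n_dims principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_dims].T


def pairwise_dist(Y):
    """Euclidean distance between every pair of rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, n_dims):
    """Recover coordinates whose pairwise distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_dims]  # keep the largest n_dims
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))


def consensus_embedding_sketch(X, n_dims=2, n_embeddings=20,
                               subset_frac=0.5, seed=0):
    """Illustrative ensemble of embeddings (assumed scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    k = min(n_features, max(n_dims, int(subset_frac * n_features)))
    dists = []
    for _ in range(n_embeddings):
        # Assumption: diversity comes from random feature subsets.
        cols = rng.choice(n_features, size=k, replace=False)
        D = pairwise_dist(pca_embed(X[:, cols], n_dims))
        scale = D.max()
        dists.append(D / scale if scale > 0 else D)
    # Assumption: the median across embeddings serves as the consensus.
    consensus = np.median(np.stack(dists), axis=0)
    return classical_mds(consensus, n_dims)


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(30, 10))
    print(consensus_embedding_sketch(data).shape)
```

The median is used here only because it is less sensitive than the mean to a single poor base embedding; any other combination rule could be substituted in the marked line.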