Mining SOM expression portraits: feature selection and integrating concepts of molecular function
BACKGROUND: Self-organizing maps (SOMs) enable the straightforward portrayal of high-dimensional data from large sample collections as sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes, which require subsequent significance filtering a...
Main Authors: | Wirth, Henry; von Bergen, Martin; Binder, Hans |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599960/ https://www.ncbi.nlm.nih.gov/pubmed/23043905 http://dx.doi.org/10.1186/1756-0381-5-18 |
_version_ | 1782475569168908288 |
---|---|
author | Wirth, Henry; von Bergen, Martin; Binder, Hans |
collection | PubMed |
description | BACKGROUND: Self-organizing maps (SOMs) enable the straightforward portrayal of high-dimensional data from large sample collections as sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes, which require subsequent significance filtering and functional interpretation. We address feature selection in terms of the gene ranking problem and the interpretation of the obtained spot-related lists using concepts of molecular function. RESULTS: Different expression scores, based either on simple fold-change measures or on regularized Student’s t-statistics, are applied to spot-related gene lists and compared with special emphasis on the error characteristics of microarray expression data. The spot-clusters are analyzed using different methods of gene set enrichment analysis, with a focus on overexpression and/or overrepresentation of predefined sets of genes. Metagene-related overrepresentation of selected gene sets was mapped into the SOM images to assign gene function to different regions. Alternatively, we estimated set-related overexpression profiles over all samples studied using a gene set enrichment score. This score was also applied to the spot-clusters to generate lists of enriched gene sets. As an illustrative example, we used the tissue body index data set, a collection of expression data from human tissues. We found that tissue-related spots typically contain enriched gene sets that correspond well to molecular processes in the respective tissues. In addition, we display special sets of housekeeping genes and of consistently weakly and highly expressed genes using SOM data filtering. CONCLUSIONS: The presented methods allow the comprehensive downstream analysis of SOM-transformed expression data in terms of cluster-related gene lists and enriched gene sets for functional interpretation. SOM clustering makes it possible either to define new gene sets from selected SOM spots or to verify and/or amend existing ones. |
format | Online Article Text |
id | pubmed-3599960 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-35999602013-03-23 Mining SOM expression portraits: feature selection and integrating concepts of molecular function Wirth, Henry von Bergen, Martin Binder, Hans BioData Min Research BioMed Central 2012-10-08 /pmc/articles/PMC3599960/ /pubmed/23043905 http://dx.doi.org/10.1186/1756-0381-5-18 Text en Copyright ©2012 Wirth et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Mining SOM expression portraits: feature selection and integrating concepts of molecular function |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599960/ https://www.ncbi.nlm.nih.gov/pubmed/23043905 http://dx.doi.org/10.1186/1756-0381-5-18 |