
Mining SOM expression portraits: feature selection and integrating concepts of molecular function

BACKGROUND: Self-organizing maps (SOM) enable the straightforward portraying of high-dimensional data of large sample collections in terms of sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes, which require subsequent significance filtering a...


Bibliographic Details
Main Authors: Wirth, Henry; von Bergen, Martin; Binder, Hans
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3599960/
https://www.ncbi.nlm.nih.gov/pubmed/23043905
http://dx.doi.org/10.1186/1756-0381-5-18
author Wirth, Henry
von Bergen, Martin
Binder, Hans
collection PubMed
description BACKGROUND: Self-organizing maps (SOM) enable the straightforward portraying of high-dimensional data of large sample collections in terms of sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes, which require subsequent significance filtering and functional interpretation. We address feature selection in terms of the gene ranking problem and the interpretation of the obtained spot-related lists using concepts of molecular function. RESULTS: Different expression scores, based either on simple fold-change measures or on regularized Student’s t-statistics, are applied to spot-related gene lists and compared, with special emphasis on the error characteristics of microarray expression data. The spot-clusters are analyzed using different methods of gene set enrichment analysis, with the focus on overexpression and/or overrepresentation of predefined sets of genes. Metagene-related overrepresentation of selected gene sets was mapped into the SOM images to assign gene function to different regions. Alternatively, we estimated set-related overexpression profiles over all samples studied using a gene set enrichment score. This score was also applied to the spot-clusters to generate lists of enriched gene sets. We used the tissue body index data set, a collection of expression data of human tissues, as an illustrative example. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. In addition, we display special sets of housekeeping genes and of consistently weakly and highly expressed genes using SOM data filtering. CONCLUSIONS: The presented methods allow the comprehensive downstream analysis of SOM-transformed expression data in terms of cluster-related gene lists and enriched gene sets for functional interpretation. SOM clustering makes it possible either to define new gene sets using selected SOM spots or to verify and/or amend existing ones.
format Online
Article
Text
id pubmed-3599960
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3599960 2013-03-23 Mining SOM expression portraits: feature selection and integrating concepts of molecular function Wirth, Henry von Bergen, Martin Binder, Hans BioData Min Research BioMed Central 2012-10-08 /pmc/articles/PMC3599960/ /pubmed/23043905 http://dx.doi.org/10.1186/1756-0381-5-18 Text en Copyright ©2012 Wirth et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mining SOM expression portraits: feature selection and integrating concepts of molecular function
topic Research