Filtering Genes for Cluster and Network Analysis

BACKGROUND: Prior to cluster analysis or genetic network analysis it is customary to filter, or remove genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpreta...

Bibliographic Details
Main Authors: Tritchler, David; Parkhomenko, Elena; Beyene, Joseph
Format: Text
Language: English
Published: BioMed Central 2009
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2708160/
https://www.ncbi.nlm.nih.gov/pubmed/19549335
http://dx.doi.org/10.1186/1471-2105-10-193
author Tritchler, David
Parkhomenko, Elena
Beyene, Joseph
collection PubMed
description BACKGROUND: Prior to cluster analysis or genetic network analysis it is customary to filter, or remove genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpretability and reduce bias. RESULTS: This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks. CONCLUSION: The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.
format Text
id pubmed-2708160
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2708160 2009-07-09
Filtering Genes for Cluster and Network Analysis
Tritchler, David; Parkhomenko, Elena; Beyene, Joseph
BMC Bioinformatics, Methodology Article
BioMed Central 2009-06-23
/pmc/articles/PMC2708160/ /pubmed/19549335 http://dx.doi.org/10.1186/1471-2105-10-193
Text en
Copyright © 2009 Tritchler et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Filtering Genes for Cluster and Network Analysis
topic Methodology Article