
Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets

BACKGROUND: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously...


Bibliographic Details
Main Authors: Gatti, Daniel M; Barry, William T; Nobel, Andrew B; Rusyn, Ivan; Wright, Fred A
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3091509/
https://www.ncbi.nlm.nih.gov/pubmed/20955544
http://dx.doi.org/10.1186/1471-2164-11-574
collection PubMed
description BACKGROUND: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously violated, as extensive correlation between genes is a well-documented phenomenon. RESULTS: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence assumption-based gene set testing procedure produces very high false positive rates when applied to data sets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data. CONCLUSIONS: These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling based gene set testing criteria in the peer reviewed biomedical literature.
format Text
id pubmed-3091509
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3091509 2011-05-10 BMC Genomics Research Article BioMed Central 2010-10-18 /pmc/articles/PMC3091509/ /pubmed/20955544 http://dx.doi.org/10.1186/1471-2164-11-574 Text en Copyright ©2010 Gatti et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets
topic Research Article
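The abstract in this record contrasts gene set tests that assume independent genes with an array resampling approach. The following is a minimal sketch of the general idea of array (sample label) permutation, not the authors' actual procedure. The function names, the absolute mean-difference score, and the in-set versus out-of-set statistic are all assumptions chosen for illustration. Because whole arrays are relabeled, each permuted dataset keeps the gene-gene correlation of the original data, which is what an independence-based null distribution ignores.

```python
import random
from statistics import mean


def gene_scores(expr, labels):
    """Per-gene absolute difference in group means.

    expr is a list of rows (one row per gene, one column per array);
    labels holds 0/1 group membership for each array.
    This score is a simple illustrative stand-in for a t-statistic.
    """
    scores = []
    for row in expr:
        group1 = [x for x, lab in zip(row, labels) if lab == 1]
        group0 = [x for x, lab in zip(row, labels) if lab == 0]
        scores.append(abs(mean(group1) - mean(group0)))
    return scores


def competitive_statistic(scores, gene_set):
    """Mean score of genes in the set minus mean score of all other genes."""
    members = set(gene_set)
    inside = [s for i, s in enumerate(scores) if i in members]
    outside = [s for i, s in enumerate(scores) if i not in members]
    return mean(inside) - mean(outside)


def array_permutation_pvalue(expr, labels, gene_set, n_perm=200, seed=0):
    """Permutation p-value obtained by shuffling array labels.

    Shuffling labels (not genes) leaves every array intact, so the
    correlation structure among genes is preserved under the null.
    """
    rng = random.Random(seed)
    observed = competitive_statistic(gene_scores(expr, labels), gene_set)
    permuted = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        stat = competitive_statistic(gene_scores(expr, permuted), gene_set)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + hits) / (1 + n_perm)


if __name__ == "__main__":
    # Hypothetical toy data: 2 genes in the set, 4 genes outside it, 8 arrays.
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    expr = [[0, 0, 0, 0, 5, 5, 5, 5]] * 2 + [[1, 0, 1, 0, 1, 0, 1, 0]] * 4
    print(array_permutation_pvalue(expr, labels, gene_set=[0, 1]))
```

In practice the per-gene score, the set-level statistic, and the number of permutations would be chosen to match the study design; the sketch only shows where the resampling step sits relative to the scoring step.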