
Investigating the effect of paralogs on microarray gene-set analysis

BACKGROUND: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group g...


Bibliographic Details
Main Authors: Faure, Andre J, Seoighe, Cathal, Mulder, Nicola J
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037853/
https://www.ncbi.nlm.nih.gov/pubmed/21261946
http://dx.doi.org/10.1186/1471-2105-12-29
_version_ 1782198018321154048
author Faure, Andre J
Seoighe, Cathal
Mulder, Nicola J
collection PubMed
description BACKGROUND: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. RESULTS: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene http://www.cbio.uct.ac.za/indygene, a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. CONCLUSIONS: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
format Text
id pubmed-3037853
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30378532011-02-18 Investigating the effect of paralogs on microarray gene-set analysis Faure, Andre J Seoighe, Cathal Mulder, Nicola J BMC Bioinformatics Research Article BACKGROUND: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. RESULTS: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene http://www.cbio.uct.ac.za/indygene, a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. CONCLUSIONS: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies. 
BioMed Central 2011-01-24 /pmc/articles/PMC3037853/ /pubmed/21261946 http://dx.doi.org/10.1186/1471-2105-12-29 Text en Copyright ©2011 Faure et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Investigating the effect of paralogs on microarray gene-set analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037853/
https://www.ncbi.nlm.nih.gov/pubmed/21261946
http://dx.doi.org/10.1186/1471-2105-12-29
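The description above says that Indygene reduces a supplied gene list so that no pairwise paralogy relationship remains above a specified sequence similarity threshold. The record does not say how the tool does this, so the following is only a minimal sketch of one possible greedy reduction, not the authors' implementation. The function name `reduce_paralogs`, the `similarity` lookup table, and the gene identifiers are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def reduce_paralogs(
    genes: Iterable[str],
    similarity: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[str]:
    """Greedily keep genes so that no kept pair exceeds `threshold`.

    `similarity` maps an unordered gene pair to a sequence similarity
    score (hypothetical input; pairs not listed are treated as 0.0).
    Genes are considered in the order supplied, so earlier genes win.
    """
    kept: List[str] = []
    for gene in genes:
        # Keep the gene only if it is not too similar to any gene already kept.
        if all(
            similarity.get(frozenset((gene, other)), 0.0) <= threshold
            for other in kept
        ):
            kept.append(gene)
    return kept


# Hypothetical example: geneA and geneB are close paralogs, geneC is not.
pairs = {
    frozenset(("geneA", "geneB")): 0.9,
    frozenset(("geneB", "geneC")): 0.5,
}
reduced = reduce_paralogs(["geneA", "geneB", "geneC"], pairs, threshold=0.7)
print(reduced)
```

A greedy pass like this depends on input order and does not guarantee the largest possible reduced list. It only guarantees that no retained pair scores above the threshold.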