
Down-weighting overlapping genes improves gene set analysis

BACKGROUND: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. RESULTS: In this wor...


Bibliographic Details
Main Authors: Tarca, Adi Laurentiu, Draghici, Sorin, Bhatti, Gaurav, Romero, Roberto
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3443069/
https://www.ncbi.nlm.nih.gov/pubmed/22713124
http://dx.doi.org/10.1186/1471-2105-13-136
author Tarca, Adi Laurentiu
Draghici, Sorin
Bhatti, Gaurav
Romero, Roberto
collection PubMed
description BACKGROUND: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. RESULTS: In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize genes that appear in few gene sets over genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods, which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. CONCLUSIONS: PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.
format Online
Article
Text
id pubmed-3443069
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3443069 2012-09-18 Down-weighting overlapping genes improves gene set analysis Tarca, Adi Laurentiu Draghici, Sorin Bhatti, Gaurav Romero, Roberto BMC Bioinformatics Research Article BioMed Central 2012-06-19 /pmc/articles/PMC3443069/ /pubmed/22713124 http://dx.doi.org/10.1186/1471-2105-13-136 Text en Copyright ©2012 Tarca et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Down-weighting overlapping genes improves gene set analysis
topic Research Article
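The scoring idea stated in the abstract (a gene set score computed as the mean of absolute weighted moderated t-scores, with larger weights for genes that appear in few gene sets) can be illustrated with a minimal Python sketch. This is not the PADOG R package: the function names, the toy data, and in particular the weight formula (a value between 1 and 2 that decreases with the number of gene sets containing the gene) are assumptions made for illustration, since the record does not give the exact weighting. The moderated t-scores are taken as given inputs, and any significance assessment is left out.

```python
from collections import Counter
from math import sqrt


def gene_weights(gene_sets):
    """Assumed weighting: 2 for the rarest genes, 1 for the most shared ones."""
    # Count in how many gene sets each gene appears.
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        # Every gene is equally frequent, so no gene is down-weighted.
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}


def gene_set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over genes that have a t-score."""
    vals = [abs(weights[g] * t_scores[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical toy data: g2 belongs to all three sets, the other genes to one.
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": -1.0}

weights = gene_weights(gene_sets)
scores = {name: gene_set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
print(scores)  # {'A': 3.5, 'B': 2.0, 'C': 2.5}
```

In this toy example the shared gene g2 keeps weight 1 while the set-specific genes get weight 2, so a strong t-score on g2 alone contributes less to each set's score than it would under equal weighting.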