Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function

BACKGROUND: A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either, one generates an ordered list based on the differential expression values of the pr...

Bibliographic Details
Main Authors: Törönen, Petri; Ojala, Pauli J; Marttinen, Pekka; Holm, Liisa
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761411/
https://www.ncbi.nlm.nih.gov/pubmed/19775443
http://dx.doi.org/10.1186/1471-2105-10-307
author Törönen, Petri
Ojala, Pauli J
Marttinen, Pekka
Holm, Liisa
collection PubMed
description BACKGROUND: A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either, one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes. Alternatively, one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined. RESULTS: We introduce a scoring function, Gene Set Z-score (GSZ), for the analysis of functional class over-representation that combines two previous analysis methods. GSZ encompasses popular functions such as correlation, hypergeometric test, Max-Mean and Random Sets as limiting cases. GSZ is stable against changes in class size as well as across different positions of the analysed gene list in tests with randomized data. GSZ shows the best overall performance in a detailed comparison to popular functions using artificial data. Likewise, GSZ stands out in a cross-validation of methods using split real data. A comparison of empirical p-values further shows a strong difference in favour of GSZ, which clearly reports better p-values for top classes than the other methods. Furthermore, GSZ detects relevant biological themes that are missed by the other methods. These observations also hold when comparing GSZ with popular program packages. CONCLUSION: GSZ and improved versions of earlier methods are a useful contribution to the analysis of differential gene expression. The methods and supplementary material are available from the website http://ekhidna.biocenter.helsinki.fi/users/petri/public/GSZ/GSZscore.html.
format Text
id pubmed-2761411
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2761411 2009-10-14 Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function Törönen, Petri Ojala, Pauli J Marttinen, Pekka Holm, Liisa BMC Bioinformatics Research article BioMed Central 2009-09-23 /pmc/articles/PMC2761411/ /pubmed/19775443 http://dx.doi.org/10.1186/1471-2105-10-307 Text en Copyright ©2009 Törönen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function
topic Research article