Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function
BACKGROUND: A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either, one generates an ordered list based on the differential expression values of the pr...
Main authors: | Törönen, Petri; Ojala, Pauli J; Marttinen, Pekka; Holm, Liisa |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761411/ https://www.ncbi.nlm.nih.gov/pubmed/19775443 http://dx.doi.org/10.1186/1471-2105-10-307 |
_version_ | 1782172834718547968 |
---|---|
author | Törönen, Petri Ojala, Pauli J Marttinen, Pekka Holm, Liisa |
author_facet | Törönen, Petri Ojala, Pauli J Marttinen, Pekka Holm, Liisa |
author_sort | Törönen, Petri |
collection | PubMed |
description | BACKGROUND: A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either, one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes. Alternatively, one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined. RESULTS: We introduce a scoring function, Gene Set Z-score (GSZ), for the analysis of functional class over-representation that combines two previous analysis methods. GSZ encompasses popular functions such as correlation, hypergeometric test, Max-Mean and Random Sets as limiting cases. GSZ is stable against changes in class size as well as across different positions of the analysed gene list in tests with randomized data. GSZ shows the best overall performance in a detailed comparison to popular functions using artificial data. Likewise, GSZ stands out in a cross-validation of methods using split real data. A comparison of empirical p-values further shows a strong difference in favour of GSZ, which clearly reports better p-values for top classes than the other methods. Furthermore, GSZ detects relevant biological themes that are missed by the other methods. These observations also hold when comparing GSZ with popular program packages. CONCLUSION: GSZ and improved versions of earlier methods are a useful contribution to the analysis of differential gene expression. The methods and supplementary material are available from the website http://ekhidna.biocenter.helsinki.fi/users/petri/public/GSZ/GSZscore.html. |
format | Text |
id | pubmed-2761411 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-27614112009-10-14 Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function Törönen, Petri Ojala, Pauli J Marttinen, Pekka Holm, Liisa BMC Bioinformatics Research article BACKGROUND: A central task in contemporary biosciences is the identification of biological processes showing response in genome-wide differential gene expression experiments. Two types of analysis are common. Either, one generates an ordered list based on the differential expression values of the probed genes and examines the tail areas of the list for over-representation of various functional classes. Alternatively, one monitors the average differential expression level of genes belonging to a given functional class. So far these two types of method have not been combined. RESULTS: We introduce a scoring function, Gene Set Z-score (GSZ), for the analysis of functional class over-representation that combines two previous analysis methods. GSZ encompasses popular functions such as correlation, hypergeometric test, Max-Mean and Random Sets as limiting cases. GSZ is stable against changes in class size as well as across different positions of the analysed gene list in tests with randomized data. GSZ shows the best overall performance in a detailed comparison to popular functions using artificial data. Likewise, GSZ stands out in a cross-validation of methods using split real data. A comparison of empirical p-values further shows a strong difference in favour of GSZ, which clearly reports better p-values for top classes than the other methods. Furthermore, GSZ detects relevant biological themes that are missed by the other methods. These observations also hold when comparing GSZ with popular program packages. CONCLUSION: GSZ and improved versions of earlier methods are a useful contribution to the analysis of differential gene expression. The methods and supplementary material are available from the website http://ekhidna.biocenter.helsinki.fi/users/petri/public/GSZ/GSZscore.html. BioMed Central 2009-09-23 /pmc/articles/PMC2761411/ /pubmed/19775443 http://dx.doi.org/10.1186/1471-2105-10-307 Text en Copyright ©2009 Törönen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research article Törönen, Petri Ojala, Pauli J Marttinen, Pekka Holm, Liisa Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title | Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title_full | Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title_fullStr | Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title_full_unstemmed | Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title_short | Robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
title_sort | robust extraction of functional signals from gene set analysis using a generalized threshold free scoring function |
topic | Research article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761411/ https://www.ncbi.nlm.nih.gov/pubmed/19775443 http://dx.doi.org/10.1186/1471-2105-10-307 |
work_keys_str_mv | AT toronenpetri robustextractionoffunctionalsignalsfromgenesetanalysisusingageneralizedthresholdfreescoringfunction AT ojalapaulij robustextractionoffunctionalsignalsfromgenesetanalysisusingageneralizedthresholdfreescoringfunction AT marttinenpekka robustextractionoffunctionalsignalsfromgenesetanalysisusingageneralizedthresholdfreescoringfunction AT holmliisa robustextractionoffunctionalsignalsfromgenesetanalysisusingageneralizedthresholdfreescoringfunction |
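
Illustrative note: the abstract contrasts two common analysis styles, testing the tail of a ranked gene list for over-representation of a functional class, and monitoring the average differential expression of a class. The Python sketch below shows one simple form of each, using only the standard library. It is not the GSZ score and not the authors' code; the function names `hypergeom_tail_p` and `class_mean_z`, the toy scores, and the choice of a finite-population z-score for the class mean are all assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import mean, pstdev


def hypergeom_tail_p(n_total, n_class, n_top, n_hits):
    """Upper-tail hypergeometric probability P(X >= n_hits).

    n_total: genes on the list; n_class: genes in the functional class;
    n_top: size of the examined tail; n_hits: class members in that tail.
    """
    upper = min(n_class, n_top)
    denom = comb(n_total, n_top)
    numer = sum(
        comb(n_class, k) * comb(n_total - n_class, n_top - k)
        for k in range(max(n_hits, 0), upper + 1)
    )
    return numer / denom


def class_mean_z(scores, members):
    """Z-score of a class's mean score against random subsets of equal size.

    scores: dict mapping gene -> differential expression value.
    members: genes in the class (all must be keys of scores).
    Uses the variance of a mean sampled without replacement.
    """
    n, m = len(scores), len(members)
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    se = sigma / sqrt(m) * sqrt((n - m) / (n - 1))
    return (mean(scores[g] for g in members) - mu) / se


# Hypothetical toy data: six genes, one two-gene class.
toy_scores = {"a": 3.0, "b": 2.0, "c": 1.0, "d": -1.0, "e": -2.0, "f": -3.0}
toy_class = {"a", "b"}

# Tail analysis: take the top 3 genes by score and count class members there.
top3 = sorted(toy_scores, key=toy_scores.get, reverse=True)[:3]
hits = len(toy_class.intersection(top3))
p_tail = hypergeom_tail_p(len(toy_scores), len(toy_class), len(top3), hits)

# Class-average analysis on the same data.
z_mean = class_mean_z(toy_scores, toy_class)
```

The first function depends on a chosen cut-off (`n_top`), while the second uses every gene's value without a threshold; the abstract describes GSZ as combining these two perspectives.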