Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures

Detection of yet unknown subgroups showing differential gene or protein expression is a frequent goal in the analysis of modern molecular data. Applications range from cancer biology over developmental biology to toxicology. Often a control and an experimental group are compared, and subgroups can b...

Bibliographic Details
Main authors: Ahrens, Maike, Turewicz, Michael, Casjens, Swaantje, May, Caroline, Pesch, Beate, Stephan, Christian, Woitalla, Dirk, Gold, Ralf, Brüning, Thomas, Meyer, Helmut E., Rahnenführer, Jörg, Eisenacher, Martin
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3838370/
https://www.ncbi.nlm.nih.gov/pubmed/24278130
http://dx.doi.org/10.1371/journal.pone.0079380
_version_ 1782478347777867776
author Ahrens, Maike
Turewicz, Michael
Casjens, Swaantje
May, Caroline
Pesch, Beate
Stephan, Christian
Woitalla, Dirk
Gold, Ralf
Brüning, Thomas
Meyer, Helmut E.
Rahnenführer, Jörg
Eisenacher, Martin
author_facet Ahrens, Maike
Turewicz, Michael
Casjens, Swaantje
May, Caroline
Pesch, Beate
Stephan, Christian
Woitalla, Dirk
Gold, Ralf
Brüning, Thomas
Meyer, Helmut E.
Rahnenführer, Jörg
Eisenacher, Martin
author_sort Ahrens, Maike
collection PubMed
description Detection of yet unknown subgroups showing differential gene or protein expression is a frequent goal in the analysis of modern molecular data. Applications range from cancer biology over developmental biology to toxicology. Often a control and an experimental group are compared, and subgroups can be characterized by differential expression for only a subgroup-specific set of genes or proteins. Finding such genes and corresponding patient subgroups can help in understanding pathological pathways, diagnosis and defining drug targets. The size of the subgroup and the type of differential expression determine the optimal strategy for subgroup identification. To date, commonly used software packages hardly provide statistical tests and methods for the detection of such subgroups. Different univariate methods for subgroup detection are characterized and compared, both on simulated and on real data. We present an advanced design for simulation studies: Data is simulated under different distributional assumptions for the expression of the subgroup, and performance results are compared against theoretical upper bounds. For each distribution, different degrees of deviation from the majority of observations are considered for the subgroup. We evaluate classical approaches as well as various new suggestions in the context of omics data, including outlier sum, PADGE, and kurtosis. We also propose the new FisherSum score. ROC curve analysis and AUC values are used to quantify the ability of the methods to distinguish between genes or proteins with and without certain subgroup patterns. In general, FisherSum for small subgroups and t-test for large subgroups achieve best results. We apply each method to a case-control study on Parkinson's disease and underline the biological benefit of the new method.
format Online
Article
Text
id pubmed-3838370
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-38383702013-11-25 Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures Ahrens, Maike Turewicz, Michael Casjens, Swaantje May, Caroline Pesch, Beate Stephan, Christian Woitalla, Dirk Gold, Ralf Brüning, Thomas Meyer, Helmut E. Rahnenführer, Jörg Eisenacher, Martin PLoS One Research Article Detection of yet unknown subgroups showing differential gene or protein expression is a frequent goal in the analysis of modern molecular data. Applications range from cancer biology over developmental biology to toxicology. Often a control and an experimental group are compared, and subgroups can be characterized by differential expression for only a subgroup-specific set of genes or proteins. Finding such genes and corresponding patient subgroups can help in understanding pathological pathways, diagnosis and defining drug targets. The size of the subgroup and the type of differential expression determine the optimal strategy for subgroup identification. To date, commonly used software packages hardly provide statistical tests and methods for the detection of such subgroups. Different univariate methods for subgroup detection are characterized and compared, both on simulated and on real data. We present an advanced design for simulation studies: Data is simulated under different distributional assumptions for the expression of the subgroup, and performance results are compared against theoretical upper bounds. For each distribution, different degrees of deviation from the majority of observations are considered for the subgroup. We evaluate classical approaches as well as various new suggestions in the context of omics data, including outlier sum, PADGE, and kurtosis. We also propose the new FisherSum score. ROC curve analysis and AUC values are used to quantify the ability of the methods to distinguish between genes or proteins with and without certain subgroup patterns. 
In general, FisherSum for small subgroups and t-test for large subgroups achieve best results. We apply each method to a case-control study on Parkinson's disease and underline the biological benefit of the new method. Public Library of Science 2013-11-22 /pmc/articles/PMC3838370/ /pubmed/24278130 http://dx.doi.org/10.1371/journal.pone.0079380 Text en © 2013 Ahrens et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Ahrens, Maike
Turewicz, Michael
Casjens, Swaantje
May, Caroline
Pesch, Beate
Stephan, Christian
Woitalla, Dirk
Gold, Ralf
Brüning, Thomas
Meyer, Helmut E.
Rahnenführer, Jörg
Eisenacher, Martin
Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title_full Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title_fullStr Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title_full_unstemmed Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title_short Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures
title_sort detection of patient subgroups with differential expression in omics data: a comprehensive comparison of univariate measures
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3838370/
https://www.ncbi.nlm.nih.gov/pubmed/24278130
http://dx.doi.org/10.1371/journal.pone.0079380
work_keys_str_mv AT ahrensmaike detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT turewiczmichael detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT casjensswaantje detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT maycaroline detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT peschbeate detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT stephanchristian detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT woitalladirk detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT goldralf detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT bruningthomas detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT meyerhelmute detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT rahnenfuhrerjorg detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures
AT eisenachermartin detectionofpatientsubgroupswithdifferentialexpressioninomicsdataacomprehensivecomparisonofunivariatemeasures