
Nonlinear Dependence in the Discovery of Differentially Expressed Genes

Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to sa...


Bibliographic Details
Main Authors: Deller, J. R., Radha, Hayder, McCormick, J. Justin, Wang, Huiyan
Format: Online Article Text
Language: English
Published: International Scholarly Research Network 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393074/
https://www.ncbi.nlm.nih.gov/pubmed/25937940
http://dx.doi.org/10.5402/2012/564715
_version_ 1782366095507718144
author Deller, J. R.
Radha, Hayder
McCormick, J. Justin
Wang, Huiyan
collection PubMed
description Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ₀, of no difference in expression. A false discovery (type 1 error) occurs when ℍ₀ is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work involving the second-moment modeling of the z-value histogram (representing gene expression differentials) has shown significantly deleterious effects of intergene expression correlation on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, this paper extends the “moment framework” by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Nonlinear-dependence modeling reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect estimated moments to 𝔉 inference. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations.
format Online
Article
Text
id pubmed-4393074
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher International Scholarly Research Network
record_format MEDLINE/PubMed
spelling pubmed-4393074 2015-05-03 Nonlinear Dependence in the Discovery of Differentially Expressed Genes Deller, J. R. Radha, Hayder McCormick, J. Justin Wang, Huiyan ISRN Bioinform Research Article Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ₀, of no difference in expression. A false discovery (type 1 error) occurs when ℍ₀ is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work involving the second-moment modeling of the z-value histogram (representing gene expression differentials) has shown significantly deleterious effects of intergene expression correlation on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, this paper extends the “moment framework” by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Nonlinear-dependence modeling reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect estimated moments to 𝔉 inference. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations. International Scholarly Research Network 2012-04-12 /pmc/articles/PMC4393074/ /pubmed/25937940 http://dx.doi.org/10.5402/2012/564715 Text en Copyright © 2012 J. R. Deller Jr. et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Nonlinear Dependence in the Discovery of Differentially Expressed Genes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393074/
https://www.ncbi.nlm.nih.gov/pubmed/25937940
http://dx.doi.org/10.5402/2012/564715