Nonlinear Dependence in the Discovery of Differentially Expressed Genes
Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to sa...
Main authors: | Deller, J. R.; Radha, Hayder; McCormick, J. Justin; Wang, Huiyan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | International Scholarly Research Network 2012 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393074/ https://www.ncbi.nlm.nih.gov/pubmed/25937940 http://dx.doi.org/10.5402/2012/564715 |
_version_ | 1782366095507718144 |
---|---|
author | Deller, J. R. Radha, Hayder McCormick, J. Justin Wang, Huiyan |
author_facet | Deller, J. R. Radha, Hayder McCormick, J. Justin Wang, Huiyan |
author_sort | Deller, J. R. |
collection | PubMed |
description | Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ (0), of no difference in expression. A false discovery (type 1 error) occurs when ℍ (0) is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work involving the second-moment modeling of the z-value histogram (representing gene expression differentials) has shown significantly deleterious effects of intergene expression correlation on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, this paper extends the “moment framework” by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Nonlinear-dependence modeling reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect estimated moments to 𝔉 inference. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations. |
format | Online Article Text |
id | pubmed-4393074 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | International Scholarly Research Network |
record_format | MEDLINE/PubMed |
spelling | pubmed-43930742015-05-03 Nonlinear Dependence in the Discovery of Differentially Expressed Genes Deller, J. R. Radha, Hayder McCormick, J. Justin Wang, Huiyan ISRN Bioinform Research Article Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ (0), of no difference in expression. A false discovery (type 1 error) occurs when ℍ (0) is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work involving the second-moment modeling of the z-value histogram (representing gene expression differentials) has shown significantly deleterious effects of intergene expression correlation on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, this paper extends the “moment framework” by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Nonlinear-dependence modeling reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect estimated moments to 𝔉 inference. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations. International Scholarly Research Network 2012-04-12 /pmc/articles/PMC4393074/ /pubmed/25937940 http://dx.doi.org/10.5402/2012/564715 Text en Copyright © 2012 J. R. Deller Jr. et al. 
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Deller, J. R. Radha, Hayder McCormick, J. Justin Wang, Huiyan Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title | Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title_full | Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title_fullStr | Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title_full_unstemmed | Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title_short | Nonlinear Dependence in the Discovery of Differentially Expressed Genes |
title_sort | nonlinear dependence in the discovery of differentially expressed genes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393074/ https://www.ncbi.nlm.nih.gov/pubmed/25937940 http://dx.doi.org/10.5402/2012/564715 |
work_keys_str_mv | AT dellerjr nonlineardependenceinthediscoveryofdifferentiallyexpressedgenes AT radhahayder nonlineardependenceinthediscoveryofdifferentiallyexpressedgenes AT mccormickjjustin nonlineardependenceinthediscoveryofdifferentiallyexpressedgenes AT wanghuiyan nonlineardependenceinthediscoveryofdifferentiallyexpressedgenes |