A power law global error model for the identification of differentially expressed genes in microarray data

BACKGROUND: High-density oligonucleotide microarray technology enables the discovery of genes that are transcriptionally modulated in different biological samples due to physiology, disease or intervention. Methods for the identification of these so-called "differentially expressed genes"...

Bibliographic Details
Main authors: Pavelka, Norman; Pelizzola, Mattia; Vizzardelli, Caterina; Capozzoli, Monica; Splendiani, Andrea; Granucci, Francesca; Ricciardi-Castagnoli, Paola
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC545082/
https://www.ncbi.nlm.nih.gov/pubmed/15606915
http://dx.doi.org/10.1186/1471-2105-5-203
author Pavelka, Norman
Pelizzola, Mattia
Vizzardelli, Caterina
Capozzoli, Monica
Splendiani, Andrea
Granucci, Francesca
Ricciardi-Castagnoli, Paola
collection PubMed
description BACKGROUND: High-density oligonucleotide microarray technology enables the discovery of genes that are transcriptionally modulated in different biological samples due to physiology, disease or intervention. Methods for the identification of these so-called "differentially expressed genes" (DEG) would largely benefit from a deeper knowledge of the intrinsic measurement variability. Though it is clear that variance of repeated measures is highly dependent on the average expression level of a given gene, there is still a lack of consensus on how signal reproducibility is linked to signal intensity. The aim of this study was to empirically model the variance versus mean dependence in microarray data to improve the performance of existing methods for identifying DEG. RESULTS: In the present work we used data generated by our lab as well as publicly available data sets to show that dispersion of repeated measures depends on location of the measures themselves following a power law. This enables us to construct a power law global error model (PLGEM) that is applicable to various Affymetrix GeneChip data sets. A new DEG identification method is therefore proposed, consisting of a statistic designed to make explicit use of model-derived measurement spread estimates and a resampling-based hypothesis testing algorithm. CONCLUSIONS: The new method provides a control of the false positive rate, a good sensitivity vs. specificity trade-off and consistent results with varying number of replicates and even using single samples.
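The power-law dependence of dispersion on location described in the abstract can be illustrated with a minimal sketch. It assumes the relation takes the form ln(sd) = k * ln(mean) + c and estimates k and c by ordinary least squares on simulated replicates. The function names, the simulated data, and the plain least-squares fit are illustrative assumptions, not the authors' published fitting procedure.

```python
import math
import random
import statistics


def fit_power_law(means, sds):
    """Ordinary least-squares fit of ln(sd) = k * ln(mean) + c.

    Returns (k, c). This is a simplified stand-in for a power law
    error model fit, not the procedure used in the paper.
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    k = sxy / sxx
    c = y_bar - k * x_bar
    return k, c


def predicted_sd(mean, k, c):
    """Model-derived spread estimate for a given expression level."""
    return math.exp(c) * mean ** k


# Simulated data (hypothetical): 2000 genes, 4 replicates each, with a
# true standard deviation that follows sd = exp(TRUE_C) * mean ** TRUE_K.
TRUE_K, TRUE_C = 0.8, -1.0
rng = random.Random(0)
means, sds = [], []
for _ in range(2000):
    mu = math.exp(rng.uniform(3.0, 9.0))
    sigma = math.exp(TRUE_C) * mu ** TRUE_K
    replicates = [rng.gauss(mu, sigma) for _ in range(4)]
    means.append(statistics.fmean(replicates))
    sds.append(statistics.stdev(replicates))

k, c = fit_power_law(means, sds)
print(f"estimated slope k = {k:.3f}, intercept c = {c:.3f}")
print(f"modelled sd at mean 1000: {predicted_sd(1000.0, k, c):.1f}")
```

With few replicates per gene, the intercept from such a fit tends to be biased low because the logarithm of a sample standard deviation underestimates the logarithm of the true one on average; the slope is far less affected.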
format Text
id pubmed-545082
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-545082 2005-01-23
BMC Bioinformatics
Methodology Article
BioMed Central 2004-12-17
Text en
Copyright © 2004 Pavelka et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A power law global error model for the identification of differentially expressed genes in microarray data
topic Methodology Article