
MAID : An effect size based model for microarray data integration across laboratories and platforms

BACKGROUND: Gene expression profiling has the potential to unravel molecular mechanisms behind gene regulation and identify gene targets for therapeutic interventions. As microarray technology matures, the number of microarray studies has increased, resulting in many different datasets available for...


Bibliographic Details
Main authors: Borozan, Ivan; Chen, Limin; Paeper, Bryan; Heathcote, Jenny E; Edwards, Aled M; Katze, Michael; Zhang, Zhaolei; McGilvray, Ian D
Format: Text
Language: English
Published: BioMed Central 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2483727/
https://www.ncbi.nlm.nih.gov/pubmed/18616827
http://dx.doi.org/10.1186/1471-2105-9-305
author Borozan, Ivan
Chen, Limin
Paeper, Bryan
Heathcote, Jenny E
Edwards, Aled M
Katze, Michael
Zhang, Zhaolei
McGilvray, Ian D
collection PubMed
description BACKGROUND: Gene expression profiling has the potential to unravel molecular mechanisms behind gene regulation and identify gene targets for therapeutic interventions. As microarray technology matures, the number of microarray studies has increased, resulting in many different datasets available for any given disease. The sensitivity and reliability of measurements of gene expression changes can be improved through a systematic integration of different microarray datasets that address the same or similar biological questions. RESULTS: Traditional effect size models cannot be used to integrate array data that directly compare treatment to control samples expressed as log ratios of gene expression. Here we extend the traditional effect size model to integrate as many array datasets as possible. The extended effect size model (MAID) can integrate any array datatype generated with either single- or two-channel arrays using either direct or indirect designs across different laboratories and platforms. The model uses two standardized indices: the standard effect size score for experiments with two groups of data, and a new standardized index that measures the difference in gene expression between treatment and control groups for one-sample data with replicate arrays. The statistical significance of the treatment effect across studies for each gene is determined by appropriate permutation methods depending on the type of data integrated. We apply our method to three different expression datasets from two different laboratories generated using three different array platforms and two different experimental designs. Our results indicate that the proposed integration model produces an increase in statistical power for identifying differentially expressed genes when integrating data across experiments and when compared to other integration models. We also show that genes found to be significant using our data integration method are of direct biological relevance to the three experiments integrated. CONCLUSION: High-throughput genomics data provide a rich and complex source of information that could play a key role in deciphering intricate molecular networks behind disease. Here we propose an extension of the traditional effect size model to allow the integration of as many array experiments as possible with the aim of increasing the statistical power for identifying differentially expressed genes.
format Text
id pubmed-2483727
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2483727 2008-07-28 MAID : An effect size based model for microarray data integration across laboratories and platforms Borozan, Ivan Chen, Limin Paeper, Bryan Heathcote, Jenny E Edwards, Aled M Katze, Michael Zhang, Zhaolei McGilvray, Ian D BMC Bioinformatics Methodology Article BioMed Central 2008-07-10 /pmc/articles/PMC2483727/ /pubmed/18616827 http://dx.doi.org/10.1186/1471-2105-9-305 Text en Copyright © 2008 Borozan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title MAID : An effect size based model for microarray data integration across laboratories and platforms
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2483727/
https://www.ncbi.nlm.nih.gov/pubmed/18616827
http://dx.doi.org/10.1186/1471-2105-9-305