
High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses

BACKGROUND: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories, such as Gene Expression Omnibus (GEO). As such, there has been a surge in interest to use this microarray data for meta-analytic approaches, whether...


Bibliographic Details
Main authors:	Dozmorov, Mikhail G; Wren, Jonathan D
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236842/ https://www.ncbi.nlm.nih.gov/pubmed/22166002 http://dx.doi.org/10.1186/1471-2105-12-S10-S2

_version_	1782218794283827200
author	Dozmorov, Mikhail G; Wren, Jonathan D
author_facet	Dozmorov, Mikhail G; Wren, Jonathan D
author_sort	Dozmorov, Mikhail G
collection	PubMed
description	BACKGROUND: Microarray experiments are becoming increasingly common in biomedical research, as is their deposition in publicly accessible repositories such as Gene Expression Omnibus (GEO). As a result, there has been a surge of interest in using these microarray data for meta-analytic approaches, whether to increase sample size for a more powerful analysis of a specific disease (e.g. lung cancer) or to re-examine experiments for reasons different from those examined in the initial study that generated them. For the average biomedical researcher, there are a number of practical barriers to conducting such meta-analyses, including manually aggregating, filtering, and formatting the data. Methods that automatically process large repositories of microarray data into a standardized, directly comparable format will enable easier and more reliable access to microarray data for meta-analyses. METHODS: We present a simple method, robust against potential outliers, for automatic quality control and pre-processing of tens of thousands of single-channel microarray data files. GEO GDS files are quality-checked by comparing parametric distributions and then quantile-normalized to enable direct comparison of expression levels in subsequent meta-analyses. RESULTS: 13,000 human 1-color experiments were processed to create a single gene expression matrix from which subsets can be extracted to conduct meta-analyses. Interestingly, we found that when conducting a global meta-analysis of gene-gene co-expression patterns across all 13,000 experiments to predict gene function, normalization yielded minimal improvement over using the raw data. CONCLUSIONS: Normalization of microarray data appears to be of minimal importance for analyses based on co-expression patterns when the sample size is on the order of thousands of microarray datasets. Smaller subsets, however, are more prone to aberrations and artefacts, and an effective means of automating normalization procedures not only empowers meta-analytic approaches but also aids reproducibility by providing a standard way of approaching the problem. Data availability: a matrix containing the normalized expression of 20,813 genes across 13,000 experiments is available for download at . Source code for pre-processing GDS files is available from the authors upon request.
format	Online Article Text
id	pubmed-3236842
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3236842 2011-12-14 High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses Dozmorov, Mikhail G; Wren, Jonathan D BMC Bioinformatics Proceedings BioMed Central 2011-10-18 /pmc/articles/PMC3236842/ /pubmed/22166002 http://dx.doi.org/10.1186/1471-2105-12-S10-S2 Text en Copyright ©2011 Dozmorov and Wren; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Proceedings Dozmorov, Mikhail G Wren, Jonathan D High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title	High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title_full	High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title_fullStr	High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title_full_unstemmed	High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title_short	High-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
title_sort	high-throughput processing and normalization of one-color microarrays for transcriptional meta-analyses
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3236842/ https://www.ncbi.nlm.nih.gov/pubmed/22166002 http://dx.doi.org/10.1186/1471-2105-12-S10-S2
work_keys_str_mv	AT dozmorovmikhailg highthroughputprocessingandnormalizationofonecolormicroarraysfortranscriptionalmetaanalyses AT wrenjonathand highthroughputprocessingandnormalizationofonecolormicroarraysfortranscriptionalmetaanalyses
