
Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis

BACKGROUND: Batch effects due to sample preparation or array variation (type, charge, and/or platform) may influence the results of microarray experiments and thus mask and/or confound true biological differences. Of the published approaches for batch correction, the algorithm “Combating Batch Effec...


Bibliographic Details
Main authors: Kupfer, Peter, Guthke, Reinhard, Pohlers, Dirk, Huber, Rene, Koczan, Dirk, Kinne, Raimund W
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3528008/
https://www.ncbi.nlm.nih.gov/pubmed/22682473
http://dx.doi.org/10.1186/1755-8794-5-23
_version_ 1782253779449544704
author Kupfer, Peter
Guthke, Reinhard
Pohlers, Dirk
Huber, Rene
Koczan, Dirk
Kinne, Raimund W
collection PubMed
description BACKGROUND: Batch effects due to sample preparation or array variation (type, charge, and/or platform) may influence the results of microarray experiments and thus mask and/or confound true biological differences. Of the published approaches for batch correction, the algorithm “Combating Batch Effects When Combining Batches of Gene Expression Microarray Data” (ComBat) appears to be most suitable for small sample sizes and multiple batches.
METHODS: Synovial fibroblasts (SFB; purity > 98%) were obtained from rheumatoid arthritis (RA) and osteoarthritis (OA) patients (n = 6 each) and stimulated with TNF-α or TGF-β1 for 0, 1, 2, 4, or 12 hours. Gene expression was analyzed using Affymetrix Human Genome U133 Plus 2.0 chips, an alternative chip definition file, and normalization by Robust Multi-Array Analysis (RMA). Data were batch-corrected for different acquisition dates using ComBat, and the efficacy of the correction was validated using hierarchical clustering.
RESULTS: In contrast to the hierarchical clustering dendrogram before batch correction, in which RA and OA patients clustered randomly, batch correction led to a clear separation of RA and OA. Strikingly, this applied not only to the 0-hour time point (i.e., before stimulation with TNF-α/TGF-β1), but also to all time points following stimulation except for the late 12-hour time point. Batch-corrected data then allowed the identification of differentially expressed genes discriminating between RA and OA. Batch correction only marginally modified the original data, as demonstrated by preservation of the main Gene Ontology (GO) categories of interest, and by minimally changed mean expression levels (maximal change 4.087%) or variances for all genes of interest. Eight genes from the GO category “extracellular matrix structural constituent” (5 different collagens, biglycan, and tubulointerstitial nephritis antigen-like 1) were differentially expressed between RA and OA (RA > OA), both constitutively at time point 0, and at all time points following stimulation with either TNF-α or TGF-β1.
CONCLUSION: Batch correction appears to be an extremely valuable tool to eliminate non-biological batch effects, and allows the identification of genes discriminating between different joint diseases. RA-SFB show an upregulated expression of extracellular matrix components, both constitutively following isolation from the synovial membrane and upon stimulation with disease-relevant cytokines or growth factors, suggesting an “imprinted” alteration of their phenotype.
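The idea behind the batch correction and its two checks (group separation and a nearly unchanged mean expression level) can be illustrated with a minimal sketch. This is not ComBat: ComBat estimates location and scale batch parameters with an empirical Bayes model, whereas the sketch below only shifts each batch to the overall mean for a single gene. The expression values, batch labels, and the function names `center_batches`, `group_mean`, and `separated` are hypothetical, and the simple "all RA above all OA" test is only a stand-in for the hierarchical clustering used in the study.

```python
from statistics import fmean


def center_batches(values, batches):
    """Location-only adjustment: shift each batch so its mean equals the overall mean."""
    grand_mean = fmean(values)
    batch_mean = {
        b: fmean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_mean[b] + grand_mean for v, b in zip(values, batches)]


def group_mean(values, labels, group):
    """Mean expression of all samples carrying the given disease label."""
    return fmean(v for v, g in zip(values, labels) if g == group)


def separated(values, labels, high="RA", low="OA"):
    """True if every `high` sample lies above every `low` sample (crude separation check)."""
    highs = [v for v, g in zip(values, labels) if g == high]
    lows = [v for v, g in zip(values, labels) if g == low]
    return min(highs) > max(lows)


# Hypothetical log2 expression of one gene in eight samples (not data from the study).
# Batch "B" carries an artificial offset of +3; RA is about 2 units above OA in both batches.
disease = ["RA", "RA", "OA", "OA", "RA", "RA", "OA", "OA"]
batches = ["A", "A", "A", "A", "B", "B", "B", "B"]
raw = [10.1, 9.9, 8.1, 7.9, 13.1, 12.9, 11.1, 10.9]

adjusted = center_batches(raw, batches)

# Percent change of the RA group mean caused by the adjustment.
ra_before = group_mean(raw, disease, "RA")
ra_after = group_mean(adjusted, disease, "RA")
percent_change = 100.0 * (ra_after - ra_before) / ra_before

print("separated before:", separated(raw, disease))
print("separated after: ", separated(adjusted, disease))
print("RA mean change (%):", round(percent_change, 3))
```

In this toy example both disease groups are equally represented in each batch, which is why plain centering leaves the group means untouched. When disease and batch are unbalanced or confounded, centering alone would also remove biological signal, which is one reason a model-based method such as ComBat, with the biological group supplied as a covariate, is preferred in practice.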
format Online
Article
Text
id pubmed-3528008
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3528008 2012-12-21 Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis Kupfer, Peter Guthke, Reinhard Pohlers, Dirk Huber, Rene Koczan, Dirk Kinne, Raimund W BMC Med Genomics Research Article BioMed Central 2012-06-08 /pmc/articles/PMC3528008/ /pubmed/22682473 http://dx.doi.org/10.1186/1755-8794-5-23 Text en Copyright ©2012 Kupfer et al.; http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3528008/
https://www.ncbi.nlm.nih.gov/pubmed/22682473
http://dx.doi.org/10.1186/1755-8794-5-23