Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis
BACKGROUND: Batch effects due to sample preparation or array variation (type, charge, and/or platform) may influence the results of microarray experiments and thus mask and/or confound true biological differences. Of the published approaches for batch correction, the algorithm “Combating Batch Effec...
Main Authors: | Kupfer, Peter; Guthke, Reinhard; Pohlers, Dirk; Huber, Rene; Koczan, Dirk; Kinne, Raimund W |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3528008/ https://www.ncbi.nlm.nih.gov/pubmed/22682473 http://dx.doi.org/10.1186/1755-8794-5-23 |
_version_ | 1782253779449544704 |
---|---|
author | Kupfer, Peter; Guthke, Reinhard; Pohlers, Dirk; Huber, Rene; Koczan, Dirk; Kinne, Raimund W |
author_facet | Kupfer, Peter; Guthke, Reinhard; Pohlers, Dirk; Huber, Rene; Koczan, Dirk; Kinne, Raimund W |
author_sort | Kupfer, Peter |
collection | PubMed |
description | BACKGROUND: Batch effects due to sample preparation or array variation (type, charge, and/or platform) may influence the results of microarray experiments and thus mask and/or confound true biological differences. Of the published approaches for batch correction, the algorithm “Combating Batch Effects When Combining Batches of Gene Expression Microarray Data” (ComBat) appears to be most suitable for small sample sizes and multiple batches. METHODS: Synovial fibroblasts (SFB; purity > 98%) were obtained from rheumatoid arthritis (RA) and osteoarthritis (OA) patients (n = 6 each) and stimulated with TNF-α or TGF-β1 for 0, 1, 2, 4, or 12 hours. Gene expression was analyzed using Affymetrix Human Genome U133 Plus 2.0 chips, an alternative chip definition file, and normalization by Robust Multi-Array Analysis (RMA). Data were batch-corrected for different acquisition dates using ComBat and the efficacy of the correction was validated using hierarchical clustering. RESULTS: In contrast to the hierarchical clustering dendrogram before batch correction, in which RA and OA patients clustered randomly, batch correction led to a clear separation of RA and OA. Strikingly, this applied not only to the 0 hour time point (i.e., before stimulation with TNF-α/TGF-β1), but also to all time points following stimulation except for the late 12 hour time point. Batch-corrected data then allowed the identification of differentially expressed genes discriminating between RA and OA. Batch correction only marginally modified the original data, as demonstrated by preservation of the main Gene Ontology (GO) categories of interest, and by minimally changed mean expression levels (maximal change 4.087%) or variances for all genes of interest. Eight genes from the GO category “extracellular matrix structural constituent” (5 different collagens, biglycan, and tubulointerstitial nephritis antigen-like 1) were differentially expressed between RA and OA (RA > OA), both constitutively at time point 0, and at all time points following stimulation with either TNF-α or TGF-β1. CONCLUSION: Batch correction appears to be an extremely valuable tool to eliminate non-biological batch effects, and allows the identification of genes discriminating between different joint diseases. RA-SFB show an upregulated expression of extracellular matrix components, both constitutively following isolation from the synovial membrane and upon stimulation with disease-relevant cytokines or growth factors, suggesting an “imprinted” alteration of their phenotype. |
format | Online Article Text |
id | pubmed-3528008 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3528008 2012-12-21 Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis Kupfer, Peter; Guthke, Reinhard; Pohlers, Dirk; Huber, Rene; Koczan, Dirk; Kinne, Raimund W BMC Med Genomics Research Article BioMed Central 2012-06-08 /pmc/articles/PMC3528008/ /pubmed/22682473 http://dx.doi.org/10.1186/1755-8794-5-23 Text en Copyright ©2012 Kupfer et al.; http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Batch correction of microarray data substantially improves the identification of genes differentially expressed in Rheumatoid Arthritis and Osteoarthritis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3528008/ https://www.ncbi.nlm.nih.gov/pubmed/22682473 http://dx.doi.org/10.1186/1755-8794-5-23 |
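
The description above reports that batch correction removed acquisition-date effects while changing mean expression levels only minimally. The sketch below illustrates that idea on made-up numbers. It is not ComBat: it performs a location-only adjustment (shifting each batch to the overall mean), whereas ComBat also adjusts scale and shrinks the per-batch estimates with empirical Bayes. The function names, sample values, batch labels, and the group-mean check are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def center_batches(values, batches):
    """Location-only batch adjustment for one gene (a simplification, not ComBat).

    Shifts every batch so that its mean equals the overall mean of the gene.
    """
    overall = mean(values)
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]


def max_percent_change_in_group_mean(before, after, groups):
    """Largest relative change (in %) of any group mean caused by the adjustment."""
    changes = []
    for g in set(groups):
        m_before = mean(v for v, vg in zip(before, groups) if vg == g)
        m_after = mean(v for v, vg in zip(after, groups) if vg == g)
        changes.append(abs(m_after - m_before) / abs(m_before) * 100.0)
    return max(changes)


# Hypothetical log2 expression values for one gene in eight samples,
# measured on two acquisition dates (d1, d2) with a constant offset between them.
expression = [8.0, 8.2, 9.0, 9.2, 9.0, 9.2, 10.0, 10.2]
batches = ["d1", "d1", "d1", "d1", "d2", "d2", "d2", "d2"]
groups = ["OA", "OA", "RA", "RA", "OA", "OA", "RA", "RA"]

corrected = center_batches(expression, batches)
change = max_percent_change_in_group_mean(expression, corrected, groups)

print("corrected:", [round(v, 3) for v in corrected])
print("max % change in group mean:", round(change, 6))
```

In this toy data the design is balanced (each date contains the same number of OA and RA samples), so the date offset is removed while the OA and RA group means stay where they were. With unbalanced batches a simple centering like this can distort group differences, which is one reason a model-based method is used in practice.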