Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity
BACKGROUND: RNA and microarray quality assessment form an integral part of gene expression analysis and, although methods such as the RNA integrity number (RIN) algorithm reliably assess RNA integrity, the relevance of RNA integrity in gene expression analysis as well as analysis methods to accommoda...
Main Authors: | Viljoen, Katie S, Blackburn, Jonathan M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3557148/ https://www.ncbi.nlm.nih.gov/pubmed/23324084 http://dx.doi.org/10.1186/1471-2164-14-14 |
_version_ | 1782257269651537920 |
---|---|
author | Viljoen, Katie S Blackburn, Jonathan M |
author_facet | Viljoen, Katie S Blackburn, Jonathan M |
author_sort | Viljoen, Katie S |
collection | PubMed |
description | BACKGROUND: RNA and microarray quality assessment form an integral part of gene expression analysis and, although methods such as the RNA integrity number (RIN) algorithm reliably assess RNA integrity, the relevance of RNA integrity in gene expression analysis as well as analysis methods to accommodate the possible effects of degradation requires further investigation. We investigated the relationship between RNA integrity and array quality on the commonly used Affymetrix Gene 1.0 ST array platform using reliable within-array and between-array quality assessment measures. The possibility of a transcript specific bias in the apparent effect of RNA degradation on the measured gene expression signal was evaluated after either excluding quality-flagged arrays or compensation for RNA degradation at different steps in the analysis. RESULTS: Using probe-level and inter-array quality metrics to assess 34 Gene 1.0 ST array datasets derived from historical, paired tumour and normal primary colorectal cancer samples, 7 arrays (20.6%), with a mean sample RIN of 3.2 (SD = 0.42), were flagged during array quality assessment while 10 arrays from samples with RINs < 7 passed quality assessment, including one sample with a RIN < 3. We detected a transcript length bias in RNA degradation in only 5.8% of annotated transcript clusters (p-value 0.05, FC ≥ |2|), with longer and shorter than average transcripts under- and overrepresented in quality-flagged samples respectively. Applying compensatory measures for RNA degradation performed at least as well as excluding quality-flagged arrays, as judged by hierarchical clustering, gene expression analysis and Ingenuity Pathway Analysis; importantly, use of these compensatory measures had the significant benefit of enabling lower quality array data from irreplaceable clinical samples to be retained in downstream analyses.
CONCLUSIONS: Here, we demonstrate an effective array-quality assessment strategy, which will allow the user to recognize lower quality arrays that can be included in the analysis once appropriate measures are applied to account for known or unknown sources of variation, such as array quality- and batch- effects, by implementing ComBat or Surrogate Variable Analysis. This approach of quality control and analysis will be especially useful for clinical samples with variable and low RNA qualities, with RIN scores ≥ 2. |
format | Online Article Text |
id | pubmed-3557148 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-35571482013-01-31 Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity Viljoen, Katie S Blackburn, Jonathan M BMC Genomics Methodology Article BACKGROUND: RNA and microarray quality assessment form an integral part of gene expression analysis and, although methods such as the RNA integrity number (RIN) algorithm reliably assess RNA integrity, the relevance of RNA integrity in gene expression analysis as well as analysis methods to accommodate the possible effects of degradation requires further investigation. We investigated the relationship between RNA integrity and array quality on the commonly used Affymetrix Gene 1.0 ST array platform using reliable within-array and between-array quality assessment measures. The possibility of a transcript specific bias in the apparent effect of RNA degradation on the measured gene expression signal was evaluated after either excluding quality-flagged arrays or compensation for RNA degradation at different steps in the analysis. RESULTS: Using probe-level and inter-array quality metrics to assess 34 Gene 1.0 ST array datasets derived from historical, paired tumour and normal primary colorectal cancer samples, 7 arrays (20.6%), with a mean sample RIN of 3.2 (SD = 0.42), were flagged during array quality assessment while 10 arrays from samples with RINs < 7 passed quality assessment, including one sample with a RIN < 3. We detected a transcript length bias in RNA degradation in only 5.8% of annotated transcript clusters (p-value 0.05, FC ≥ |2|), with longer and shorter than average transcripts under- and overrepresented in quality-flagged samples respectively.
Applying compensatory measures for RNA degradation performed at least as well as excluding quality-flagged arrays, as judged by hierarchical clustering, gene expression analysis and Ingenuity Pathway Analysis; importantly, use of these compensatory measures had the significant benefit of enabling lower quality array data from irreplaceable clinical samples to be retained in downstream analyses. CONCLUSIONS: Here, we demonstrate an effective array-quality assessment strategy, which will allow the user to recognize lower quality arrays that can be included in the analysis once appropriate measures are applied to account for known or unknown sources of variation, such as array quality- and batch- effects, by implementing ComBat or Surrogate Variable Analysis. This approach of quality control and analysis will be especially useful for clinical samples with variable and low RNA qualities, with RIN scores ≥ 2. BioMed Central 2013-01-16 /pmc/articles/PMC3557148/ /pubmed/23324084 http://dx.doi.org/10.1186/1471-2164-14-14 Text en Copyright ©2013 Viljoen and Blackburn; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Viljoen, Katie S Blackburn, Jonathan M Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title | Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title_full | Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title_fullStr | Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title_full_unstemmed | Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title_short | Quality assessment and data handling methods for Affymetrix Gene 1.0 ST arrays with variable RNA integrity |
title_sort | quality assessment and data handling methods for affymetrix gene 1.0 st arrays with variable rna integrity |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3557148/ https://www.ncbi.nlm.nih.gov/pubmed/23324084 http://dx.doi.org/10.1186/1471-2164-14-14 |
work_keys_str_mv | AT viljoenkaties qualityassessmentanddatahandlingmethodsforaffymetrixgene10starrayswithvariablernaintegrity AT blackburnjonathanm qualityassessmentanddatahandlingmethodsforaffymetrixgene10starrayswithvariablernaintegrity |