Preprocessing and Quality Control Strategies for Illumina DASL Assay-Based Brain Gene Expression Studies with Semi-Degraded Samples
Available statistical preprocessing or quality control analysis tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the va...
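Main authors: | Chow, Maggie L.; Winn, Mary E.; Li, Hai-Ri; April, Craig; Wynshaw-Boris, Anthony; Fan, Jian-Bing; Fu, Xiang-Dong; Courchesne, Eric; Schork, Nicholas J. |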
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Research Foundation 2012 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3286152/ https://www.ncbi.nlm.nih.gov/pubmed/22375143 http://dx.doi.org/10.3389/fgene.2012.00011 |
_version_ | 1782224529520590848 |
---|---|
author | Chow, Maggie L. Winn, Mary E. Li, Hai-Ri April, Craig Wynshaw-Boris, Anthony Fan, Jian-Bing Fu, Xiang-Dong Courchesne, Eric Schork, Nicholas J. |
author_sort | Chow, Maggie L. |
collection | PubMed |
description | Available statistical preprocessing or quality control analysis tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the validity and impact of the assumptions built into preprocessing schemes for a dataset. We developed and assessed a data preprocessing strategy for use with the Illumina DASL-based gene expression assay with partially degraded postmortem prefrontal cortex samples. The samples were obtained from individuals with autism as part of an investigation of the pathogenic factors contributing to autism. Using statistical analysis methods and metrics such as those associated with multivariate distance matrix regression and mean inter-array correlation, we developed a DASL-based assay gene expression preprocessing pipeline to accommodate and detect problems with microarray-based gene expression values obtained with degraded brain samples. Key steps in the pipeline included outlier exclusion, data transformation and normalization, and batch effect and covariate corrections. Our goal was to produce a clean dataset for subsequent downstream differential expression analysis. We ultimately settled on available transformation and normalization algorithms in the R/Bioconductor package lumi based on an assessment of their use in various combinations. A log2-transformed, quantile-normalized, and batch- and seizure-corrected procedure was likely the most appropriate for our data. We empirically tested different components of our proposed preprocessing strategy, and we believe the results suggest that a strategy that effectively identifies outliers, normalizes the data, and corrects for batch effects can be applied to all studies, even those pursued with degraded samples. |
format | Online Article Text |
id | pubmed-3286152 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Frontiers Research Foundation |
record_format | MEDLINE/PubMed |
spelling | pubmed-3286152 2012-02-28 Front Genet Genetics Frontiers Research Foundation 2012-02-24 /pmc/articles/PMC3286152/ /pubmed/22375143 http://dx.doi.org/10.3389/fgene.2012.00011 Text en Copyright © 2012 Chow, Winn, Li, April, Wynshaw-Boris, Fan, Fu, Courchesne and Schork. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited. |
title | Preprocessing and Quality Control Strategies for Illumina DASL Assay-Based Brain Gene Expression Studies with Semi-Degraded Samples |
topic | Genetics |
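
The description above names three steps that a small example can make concrete: a log2 transformation, quantile normalization, and mean inter-array correlation as an outlier metric. The following is a minimal sketch in plain Python, not the authors' pipeline (which the record says used the R/Bioconductor package lumi). The function names, the `offset` parameter, the toy intensity values, and the choice to compute correlations on the log2 data are all assumptions made for illustration; batch and covariate correction are not shown.

```python
import math
from statistics import mean


def log2_transform(matrix, offset=1.0):
    """Log2-transform intensities. `offset` is an assumption that avoids log2(0)."""
    return [[math.log2(v + offset) for v in row] for row in matrix]


def columns_of(matrix):
    """Return the samples (columns) of a probes-by-samples matrix as lists."""
    return [list(col) for col in zip(*matrix)]


def quantile_normalize(matrix):
    """Force every sample (column) onto the same value distribution.

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by row order, a simplification.
    """
    cols = columns_of(matrix)
    n_rows = len(matrix)
    sorted_cols = [sorted(col) for col in cols]
    rank_means = [mean(sc[i] for sc in sorted_cols) for i in range(n_rows)]
    out = [[0.0] * len(cols) for _ in range(n_rows)]
    for c, col in enumerate(cols):
        order = sorted(range(n_rows), key=lambda r: col[r])
        for rank, r in enumerate(order):
            out[r][c] = rank_means[rank]
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_inter_array_correlation(matrix):
    """For each sample, average its correlation with every other sample."""
    cols = columns_of(matrix)
    return [
        mean(pearson(cols[i], cols[j]) for j in range(len(cols)) if j != i)
        for i in range(len(cols))
    ]


# Hypothetical toy data: 5 probes (rows) x 4 samples (columns).
# The last sample is deliberately reversed so that it behaves as an outlier.
raw = [
    [10, 12, 11, 160],
    [20, 22, 19, 80],
    [40, 38, 42, 40],
    [80, 85, 78, 20],
    [160, 150, 170, 10],
]

logged = log2_transform(raw)
iac = mean_inter_array_correlation(logged)
normalized = quantile_normalize(logged)

print("mean inter-array correlation per sample:", iac)
print("lowest-scoring sample index:", iac.index(min(iac)))
```

In this sketch a sample whose mean inter-array correlation is far below the others would be a candidate for exclusion before normalization; the cutoff to use is a study-specific decision that the record does not state.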