Monitoring of Technical Variation in Quantitative High-Throughput Datasets
High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarr...
Main Authors: | Lauss, Martin; Visne, Ilhami; Kriegner, Albert; Ringnér, Markus; Jönsson, Göran; Höglund, Mattias |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica 2013 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3785384/ https://www.ncbi.nlm.nih.gov/pubmed/24092958 http://dx.doi.org/10.4137/CIN.S12862 |
_version_ | 1782477647771598848 |
---|---|
author | Lauss, Martin Visne, Ilhami Kriegner, Albert Ringnér, Markus Jönsson, Göran Höglund, Mattias |
author_facet | Lauss, Martin Visne, Ilhami Kriegner, Albert Ringnér, Markus Jönsson, Göran Höglund, Mattias |
author_sort | Lauss, Martin |
collection | PubMed |
description | High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, in respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp, and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step. |
format | Online Article Text |
id | pubmed-3785384 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-37853842013-10-03 Monitoring of Technical Variation in Quantitative High-Throughput Datasets Lauss, Martin Visne, Ilhami Kriegner, Albert Ringnér, Markus Jönsson, Göran Höglund, Mattias Cancer Inform Original Research High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, in respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp, and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step. Libertas Academica 2013-09-23 /pmc/articles/PMC3785384/ /pubmed/24092958 http://dx.doi.org/10.4137/CIN.S12862 Text en © 2013 the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article published under the Creative Commons CC-BY-NC 3.0 license. |
spellingShingle | Original Research Lauss, Martin Visne, Ilhami Kriegner, Albert Ringnér, Markus Jönsson, Göran Höglund, Mattias Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title | Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title_full | Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title_fullStr | Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title_full_unstemmed | Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title_short | Monitoring of Technical Variation in Quantitative High-Throughput Datasets |
title_sort | monitoring of technical variation in quantitative high-throughput datasets |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3785384/ https://www.ncbi.nlm.nih.gov/pubmed/24092958 http://dx.doi.org/10.4137/CIN.S12862 |
work_keys_str_mv | AT laussmartin monitoringoftechnicalvariationinquantitativehighthroughputdatasets AT visneilhami monitoringoftechnicalvariationinquantitativehighthroughputdatasets AT kriegneralbert monitoringoftechnicalvariationinquantitativehighthroughputdatasets AT ringnermarkus monitoringoftechnicalvariationinquantitativehighthroughputdatasets AT jonssongoran monitoringoftechnicalvariationinquantitativehighthroughputdatasets AT hoglundmattias monitoringoftechnicalvariationinquantitativehighthroughputdatasets |
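
The description field above summarizes a procedure: detect technical bias by linear regression of principal components on annotations, correct for batch effects, then re-evaluate the principal components. The following is a minimal sketch of that idea in Python with NumPy, not the R package swamp named in the record, and it does not reproduce that package's interface. The function names, the synthetic data, and the choice of per-batch mean centering as the correction step are all assumptions made for illustration.

```python
import numpy as np


def pc_scores(data, n_components):
    """Return scores of the top principal components (samples x components)."""
    centered = data - data.mean(axis=0)
    # SVD of the centered matrix: columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def r_squared_by_batch(scores, batch):
    """R^2 from a linear regression of each PC on a one-hot batch design."""
    labels = np.unique(batch)
    design = (batch[:, None] == labels[None, :]).astype(float)
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    residual = scores - design @ coef
    total = scores - scores.mean(axis=0)
    return 1.0 - (residual ** 2).sum(axis=0) / (total ** 2).sum(axis=0)


def center_per_batch(data, batch):
    """Assumed stand-in correction: subtract each batch's feature means."""
    corrected = data.astype(float).copy()
    for label in np.unique(batch):
        mask = batch == label
        corrected[mask] -= corrected[mask].mean(axis=0)
    return corrected


# Synthetic example (hypothetical data): 40 samples, 200 features, 2 batches.
rng = np.random.default_rng(0)
n_samples, n_features = 40, 200
batch = np.repeat(np.array(["A", "B"]), n_samples // 2)
data = rng.normal(size=(n_samples, n_features))
data[batch == "B"] += 2.0  # artificial batch shift on every feature

# Step 1: detect. How much of each top PC does batch explain?
r2_before = r_squared_by_batch(pc_scores(data, 3), batch)

# Step 2: correct, then Step 3: re-evaluate the principal components.
corrected = center_per_batch(data, batch)
r2_after = r_squared_by_batch(pc_scores(corrected, 3), batch)

print("R^2 of PCs vs. batch before correction:", np.round(r2_before, 3))
print("R^2 of PCs vs. batch after correction: ", np.round(r2_after, 3))
```

In this sketch a high R² for an early principal component flags batch as a major source of variation. Mean centering removes only additive shifts between batches, and it would also remove real biological differences if biology and batch were confounded in the study design.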