Monitoring of Technical Variation in Quantitative High-Throughput Datasets

High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarr...

Bibliographic Details
Main Authors: Lauss, Martin; Visne, Ilhami; Kriegner, Albert; Ringnér, Markus; Jönsson, Göran; Höglund, Mattias
Format: Online Article Text
Language: English
Published: Libertas Academica 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3785384/
https://www.ncbi.nlm.nih.gov/pubmed/24092958
http://dx.doi.org/10.4137/CIN.S12862
author Lauss, Martin
Visne, Ilhami
Kriegner, Albert
Ringnér, Markus
Jönsson, Göran
Höglund, Mattias
collection PubMed
description High-dimensional datasets can be confounded by variation from technical sources, such as batches. Undetected batch effects can have severe consequences for the validity of a study’s conclusion(s). We evaluate high-throughput RNAseq and miRNAseq as well as DNA methylation and gene expression microarray datasets, mainly from the Cancer Genome Atlas (TCGA) project, in respect to technical and biological annotations. We observe technical bias in these datasets and discuss corrective interventions. We then suggest a general procedure to control study design, detect technical bias using linear regression of principal components, correct for batch effects, and re-evaluate principal components. This procedure is implemented in the R package swamp, and as graphical user interface software. In conclusion, high-throughput platforms that generate continuous measurements are sensitive to various forms of technical bias. For such data, monitoring of technical variation is an important analysis step.
format Online
Article
Text
id pubmed-3785384
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-3785384 2013-10-03
Monitoring of Technical Variation in Quantitative High-Throughput Datasets
Lauss, Martin; Visne, Ilhami; Kriegner, Albert; Ringnér, Markus; Jönsson, Göran; Höglund, Mattias
Cancer Inform
Original Research
Libertas Academica 2013-09-23
/pmc/articles/PMC3785384/ /pubmed/24092958 http://dx.doi.org/10.4137/CIN.S12862
Text en © 2013 the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article published under the Creative Commons CC-BY-NC 3.0 license.
title Monitoring of Technical Variation in Quantitative High-Throughput Datasets
topic Original Research
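
The description field outlines a procedure: detect technical bias by linear regression of principal components on annotations, correct for batch effects, then re-evaluate the principal components. A minimal sketch of that loop on simulated data follows. It is written in Python with NumPy and is not the API of the R package swamp. The function names (`top_pcs`, `r2_vs_batch`, `center_by_batch`) are hypothetical, and per-batch mean centering is assumed here only as a simple stand-in for a batch correction.

```python
import numpy as np

def top_pcs(X, k):
    """Sample scores on the first k principal components.
    X has shape (features, samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)        # center each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return (Vt[:k] * s[:k, None]).T               # shape (samples, k)

def r2_vs_batch(scores, batch):
    """R^2 from regressing each PC on one-hot batch labels."""
    levels = np.unique(batch)
    # One-hot design; its columns sum to 1, so an intercept is implied.
    D = (batch[:, None] == levels[None, :]).astype(float)
    coef, *_ = np.linalg.lstsq(D, scores, rcond=None)
    resid = scores - D @ coef
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

def center_by_batch(X, batch):
    """Assumed stand-in correction: subtract each feature's per-batch mean."""
    Xadj = X.astype(float).copy()
    for b in np.unique(batch):
        cols = batch == b
        Xadj[:, cols] -= Xadj[:, cols].mean(axis=1, keepdims=True)
    return Xadj

# Simulated data: 200 features, 40 samples, two batches of 20.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
batch = np.repeat(np.array(["A", "B"]), 20)
X[:, batch == "B"] += 2.0 * rng.normal(size=(200, 1))  # feature-wise batch shift

r2_before = r2_vs_batch(top_pcs(X, 3), batch)                         # detect
r2_after = r2_vs_batch(top_pcs(center_by_batch(X, batch), 3), batch)  # re-evaluate

print("R^2 of PC1-3 on batch, before:", np.round(r2_before, 3))
print("R^2 of PC1-3 on batch, after: ", np.round(r2_after, 3))
```

A high R² for a leading component before correction flags the batch annotation as a major source of variation. Re-running the same regression afterwards checks that the correction removed it. Note that mean centering would also remove any biological signal that is confounded with batch, which is why study design is checked first.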