
Quality Assessment of Transcriptome Data Using Intrinsic Statistical Properties


Bibliographic Details
Main authors: Brysbaert, Guillaume; Pellay, François-Xavier; Noth, Sebastian; Benecke, Arndt
Format: Online Article Text
Language: English
Published: Elsevier 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054119/
https://www.ncbi.nlm.nih.gov/pubmed/20451162
http://dx.doi.org/10.1016/S1672-0229(10)60006-X
author Brysbaert, Guillaume
Pellay, François-Xavier
Noth, Sebastian
Benecke, Arndt
collection PubMed
description In view of potential application to biomedical diagnosis, tight transcriptome data quality control is compulsory. Usually, quality control is achieved using labeling and hybridization controls added at different stages throughout the processing of the biologic RNA samples. These control measures, however, only reflect the performance of the individual technical manipulations during the entire process and have no bearing as to the continued integrity of the RNA sample itself. Here we demonstrate that intrinsic statistical properties of the resulting transcriptome data signal and signal-variance distributions and their invariance can be identified independently of the animal species studied and the labeling protocol used. From these invariant properties we have developed a data model, the parameters of which can be estimated from individual experiments and used to compute relative quality measures based on similarity with large reference datasets. These quality measures add supplementary, non-redundant information to standard quality control estimates based on spike-in and hybridization controls, and are exploitable in data analysis. A software application for analyzing datasets as well as a reference dataset for AB1700 arrays are provided. They should allow AB1700 users to easily integrate this method into their analysis pipeline, and might instigate similar developments for other transcriptome platforms.
format Online
Article
Text
id pubmed-5054119
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-5054119 2016-10-14 Quality Assessment of Transcriptome Data Using Intrinsic Statistical Properties. Brysbaert, Guillaume; Pellay, François-Xavier; Noth, Sebastian; Benecke, Arndt. Genomics Proteomics Bioinformatics. Article. Elsevier 2010-03 2010-05-05 /pmc/articles/PMC5054119/ /pubmed/20451162 http://dx.doi.org/10.1016/S1672-0229(10)60006-X Text en © 2010 Beijing Institute of Genomics. This is an open access article under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
title Quality Assessment of Transcriptome Data Using Intrinsic Statistical Properties
topic Article