Towards unified quality verification of synthetic count data with countsimQC

Bibliographic Details
Main authors: Soneson, Charlotte; Robinson, Mark D
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860609/
https://www.ncbi.nlm.nih.gov/pubmed/29028961
http://dx.doi.org/10.1093/bioinformatics/btx631
Description
Summary: Statistical tools for biological data analysis are often evaluated using synthetic data, designed to mimic the features of a specific type of experimental data. The generalizability of such evaluations depends on how well the synthetic data reproduce the main characteristics of the experimental data, and we argue that an assessment of this similarity should accompany any synthetic dataset used for method evaluation. We describe countsimQC, which provides a straightforward way to generate a stand-alone report that shows the main characteristics of (e.g. RNA-seq) count data and can be provided alongside a publication as verification of the appropriateness of any utilized synthetic data. Availability and implementation: countsimQC is implemented as an R package (for R versions ≥ 3.4) and is available from https://github.com/csoneson/countsimQC under a GPL (≥2) license.