
CONFINED: distinguishing biological from technical sources of variation by leveraging multiple methylation datasets


Bibliographic Details
Main authors: Thompson, Mike; Chen, Zeyuan Johnson; Rahmani, Elior; Halperin, Eran
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6624895/
https://www.ncbi.nlm.nih.gov/pubmed/31300005
http://dx.doi.org/10.1186/s13059-019-1743-y
Description
Summary: Methylation datasets are affected by innumerable sources of variability, both biological (cell-type composition, genetics) and technical (batch effects). Here, we propose a reference-free method based on sparse canonical correlation analysis to separate the biological from technical sources of variability. We show through simulations and real data that our method, CONFINED, is not only more accurate than the state-of-the-art reference-free methods for capturing known, replicable biological variability, but it is also considerably more robust to dataset-specific technical variability than previous approaches. CONFINED is available as an R package as detailed at https://github.com/cozygene/CONFINED. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-019-1743-y) contains supplementary material, which is available to authorized users.