TBIO-11. DEEP LEARNING-BASED SINGLE-CELL RNA SEQUENCING DIFFERENTIATION IDENTIFIES SIMPLE AND COMPLEX TRANSCRIPTIONAL NETWORKS FOR SUBPOPULATION CLASSIFICATION

Bibliographic Details
Main authors: Prince, Eric; Hankinson, Todd
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7715250/
http://dx.doi.org/10.1093/neuonc/noaa222.838
Description
Summary: BACKGROUND: Genomic assays capable of cellular resolution (i.e., scRNA-seq) are becoming ubiquitous in biomedical research. Machine learning, including the subtype known as Deep Learning, has broad application within scRNA-seq analytics. However, methods to facilitate the classification of cell populations are lacking. We present the novel computational framework HD Spot, which generates interpretable and robust Deep Learning classifiers that enable unbiased interrogation of linear and non-linear genomic signatures.
METHODS: HD Spot is written in Python and relies on Google's TensorFlow 2 deep learning framework. Four immune-cell datasets, generated on the 10x Chromium platform, were obtained from the publicly available Seurat repository. Data preprocessing used standard Seurat methodology. HD Spot generated optimized classifiers via a custom platform. Network interpretability was achieved using Shapley values. Ontology analysis was performed using Metascape.
RESULTS: HD Spot identified meaningful ontologic signatures across all tested datasets. In the binary case of control versus IFN-β-stimulated CD4+ T cells, gene ontologies reflected Th0 and Th2 T cell populations, congruent with T cell activation. In the 9-class case of PBMCs, HD Spot identified meaningful gene networks characteristic of the ground-truth populations using raw feature counts alone. When feature counts were processed into expression values, HD Spot demonstrated increased specificity of top genes and their respective ontologies between subpopulations.
CONCLUSION: This work introduces a broadly applicable computational tool that lets the advanced bioinformatician decipher complex cellular heterogeneity (e.g., in tumors) in an unbiased way. Additionally, HD Spot lowers the barrier for novice bioinformaticians to derive actionable insights from their data.
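
The abstract states that network interpretability was achieved using Shapley values but does not say which estimator HD Spot uses. As a minimal illustration of the idea only, and not of HD Spot's implementation, the sketch below computes exact Shapley values by enumerating gene coalitions for a toy scoring function. The function `toy_classifier`, its weights, and the gene names `GENE_A`, `GENE_B`, `GENE_C` are all hypothetical; a real TensorFlow 2 classifier over thousands of genes would need an approximate estimator, because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import exp, factorial


def toy_classifier(expr):
    """Hypothetical score in (0, 1) for one cell; NOT the HD Spot model.

    Mixes a linear term with a gene-gene interaction so the attribution
    has both a linear and a non-linear component to distribute.
    """
    logit = 1.5 * expr["GENE_A"] + 0.5 * expr["GENE_B"] * expr["GENE_C"] - 1.0
    return 1.0 / (1.0 + exp(-logit))


def shapley_values(model, cell, baseline):
    """Exact Shapley values for one cell relative to a baseline profile.

    Genes outside a coalition are replaced by their baseline value.
    """
    genes = list(cell)
    n = len(genes)
    values = {}
    for gene in genes:
        others = [g for g in genes if g != gene]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    g: (cell[g] if g in coalition else baseline[g]) for g in genes
                }
                with_gene = dict(without)
                with_gene[gene] = cell[gene]
                # Marginal contribution of `gene` when added to this coalition
                total += weight * (model(with_gene) - model(without))
        values[gene] = total
    return values


# Hypothetical expression values for a single cell and an all-zero baseline.
cell = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 1.5}
baseline = {"GENE_A": 0.0, "GENE_B": 0.0, "GENE_C": 0.0}

phi = shapley_values(toy_classifier, cell, baseline)
for gene, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{gene}: {value:+.3f}")
```

The attributions sum to the difference between the model output for the cell and for the baseline, which is the property that makes per-gene Shapley values usable for ranking genes. In this toy model `GENE_B` and `GENE_C` act only through their product, so against a zero baseline they receive equal credit.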