SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data

BACKGROUND: Recent studies have demonstrated the utility of scRNA-seq SNVs to distinguish tumor from normal cells, characterize intra-tumoral heterogeneity, and define mutation-associated expression signatures. In addition to cancer studies, SNVs from single cells have been useful in studies of tran...

Bibliographic Details
Main authors: Prashant, N. M., Alomran, Nawaf, Chen, Yu, Liu, Hongyu, Bousounis, Pavlos, Movassagh, Mercedeh, Edwards, Nathan, Horvath, Anelia
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459565/
https://www.ncbi.nlm.nih.gov/pubmed/34551708
http://dx.doi.org/10.1186/s12864-021-07974-8
_version_ 1784571552342212608
author Prashant, N. M.
Alomran, Nawaf
Chen, Yu
Liu, Hongyu
Bousounis, Pavlos
Movassagh, Mercedeh
Edwards, Nathan
Horvath, Anelia
author_facet Prashant, N. M.
Alomran, Nawaf
Chen, Yu
Liu, Hongyu
Bousounis, Pavlos
Movassagh, Mercedeh
Edwards, Nathan
Horvath, Anelia
author_sort Prashant, N. M.
collection PubMed
description BACKGROUND: Recent studies have demonstrated the utility of scRNA-seq SNVs to distinguish tumor from normal cells, characterize intra-tumoral heterogeneity, and define mutation-associated expression signatures. In addition to cancer studies, SNVs from single cells have been useful in studies of transcriptional burst kinetics, allelic expression, chromosome X inactivation, ploidy estimations, and haplotype inference. RESULTS: To aid these types of studies, we have developed a tool, SCReadCounts, for cell-level tabulation of the sequencing read counts bearing SNV reference and variant alleles from barcoded scRNA-seq alignments. Provided genomic loci and expected alleles, SCReadCounts generates cell-SNV matrices with the absolute variant- and reference-harboring read counts, as well as cell-SNV matrices of expressed Variant Allele Fraction (VAF(RNA)) suitable for a variety of downstream applications. We demonstrate three different SCReadCounts applications on 59,884 cells from seven neuroblastoma samples: (1) estimation of cell-level expression of known somatic mutations and RNA-editing sites, (2) estimation of cell-level allele expression of biallelic SNVs, and (3) a discovery mode assessment of the reference and each of the three alternative nucleotides at genomic positions of interest that does not require prior SNV information. For the latter, we applied SCReadCounts to the coding regions of KRAS, where it identified known and novel somatic mutations in a low-to-moderate proportion of cells. The SCReadCounts read counts module is benchmarked against the analogous modules of GATK and Samtools. SCReadCounts is freely available (https://github.com/HorvathLab/NGS) as 64-bit self-contained binary distributions for Linux and MacOS, in addition to Python source. CONCLUSIONS: SCReadCounts supplies a fast and efficient solution for estimation of cell-level SNV expression from scRNA-seq data. 
SCReadCounts enables distinguishing cells with monoallelic reference expression from those with no gene expression and is applicable to assess SNVs present in only a small proportion of the cells, such as somatic mutations in cancer. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07974-8.
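The expressed Variant Allele Fraction named in the description can be illustrated with a short sketch. It assumes VAF(RNA) is the variant-bearing read count divided by the sum of variant- and reference-bearing read counts at a locus within one cell. This is not SCReadCounts code: the input layout, the function names, and the min_reads threshold are hypothetical choices made only for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input layout: (cell_barcode, snv_id) -> (ref_reads, var_reads).
Counts = Dict[Tuple[str, str], Tuple[int, int]]


def vaf_rna(ref_reads: int, var_reads: int, min_reads: int = 1) -> Optional[float]:
    """Return var / (ref + var), or None when coverage is below min_reads."""
    if ref_reads < 0 or var_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = ref_reads + var_reads
    # Guard against division by zero even if the caller passes min_reads=0.
    if total < max(min_reads, 1):
        return None
    return var_reads / total


def vaf_matrix(counts: Counts, min_reads: int = 1) -> Dict[str, Dict[str, Optional[float]]]:
    """Build a nested cell -> SNV -> VAF mapping from per-cell read counts."""
    matrix: Dict[str, Dict[str, Optional[float]]] = {}
    for (cell, snv), (ref_reads, var_reads) in counts.items():
        matrix.setdefault(cell, {})[snv] = vaf_rna(ref_reads, var_reads, min_reads)
    return matrix


# Made-up counts for three cells at one placeholder locus.
example_counts: Counts = {
    ("cellA", "snv1"): (6, 2),  # both alleles expressed
    ("cellB", "snv1"): (5, 0),  # reference allele only
    ("cellC", "snv1"): (0, 0),  # locus not covered in this cell
}
example_matrix = vaf_matrix(example_counts)
```

In this sketch a value of 0.0 marks a cell with reads covering the locus that all carry the reference allele, while None marks a cell with no reads at the locus. Keeping the two cases apart mirrors the distinction the description draws between monoallelic reference expression and no gene expression.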
format Online
Article
Text
id pubmed-8459565
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8459565 2021-09-23 SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data Prashant, N. M. Alomran, Nawaf Chen, Yu Liu, Hongyu Bousounis, Pavlos Movassagh, Mercedeh Edwards, Nathan Horvath, Anelia BMC Genomics Methodology Article BACKGROUND: Recent studies have demonstrated the utility of scRNA-seq SNVs to distinguish tumor from normal cells, characterize intra-tumoral heterogeneity, and define mutation-associated expression signatures. In addition to cancer studies, SNVs from single cells have been useful in studies of transcriptional burst kinetics, allelic expression, chromosome X inactivation, ploidy estimations, and haplotype inference. RESULTS: To aid these types of studies, we have developed a tool, SCReadCounts, for cell-level tabulation of the sequencing read counts bearing SNV reference and variant alleles from barcoded scRNA-seq alignments. Provided genomic loci and expected alleles, SCReadCounts generates cell-SNV matrices with the absolute variant- and reference-harboring read counts, as well as cell-SNV matrices of expressed Variant Allele Fraction (VAF(RNA)) suitable for a variety of downstream applications. We demonstrate three different SCReadCounts applications on 59,884 cells from seven neuroblastoma samples: (1) estimation of cell-level expression of known somatic mutations and RNA-editing sites, (2) estimation of cell-level allele expression of biallelic SNVs, and (3) a discovery mode assessment of the reference and each of the three alternative nucleotides at genomic positions of interest that does not require prior SNV information. For the latter, we applied SCReadCounts to the coding regions of KRAS, where it identified known and novel somatic mutations in a low-to-moderate proportion of cells. The SCReadCounts read counts module is benchmarked against the analogous modules of GATK and Samtools. 
SCReadCounts is freely available (https://github.com/HorvathLab/NGS) as 64-bit self-contained binary distributions for Linux and MacOS, in addition to Python source. CONCLUSIONS: SCReadCounts supplies a fast and efficient solution for estimation of cell-level SNV expression from scRNA-seq data. SCReadCounts enables distinguishing cells with monoallelic reference expression from those with no gene expression and is applicable to assess SNVs present in only a small proportion of the cells, such as somatic mutations in cancer. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07974-8. BioMed Central 2021-09-22 /pmc/articles/PMC8459565/ /pubmed/34551708 http://dx.doi.org/10.1186/s12864-021-07974-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Prashant, N. M.
Alomran, Nawaf
Chen, Yu
Liu, Hongyu
Bousounis, Pavlos
Movassagh, Mercedeh
Edwards, Nathan
Horvath, Anelia
SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title_full SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title_fullStr SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title_full_unstemmed SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title_short SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data
title_sort screadcounts: estimation of cell-level snvs expression from scrna-seq data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459565/
https://www.ncbi.nlm.nih.gov/pubmed/34551708
http://dx.doi.org/10.1186/s12864-021-07974-8