Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data
Assessing similarity is highly important for bioinformatics algorithms to determine correlations between biological information. A common problem is that similarity can appear by chance, particularly for low expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq) data beca...
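Main authors: | Sanchez-Taltavull, Daniel; Perkins, Theodore J; Dommann, Noelle; Melin, Nicolas; Keogh, Adrian; Candinas, Daniel; Stroka, Deborah; Beldi, Guido |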
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671344/ https://www.ncbi.nlm.nih.gov/pubmed/33575552 http://dx.doi.org/10.1093/nargab/lqaa002 |
_version_ | 1783610911597002752 |
---|---|
author | Sanchez-Taltavull, Daniel Perkins, Theodore J Dommann, Noelle Melin, Nicolas Keogh, Adrian Candinas, Daniel Stroka, Deborah Beldi, Guido |
author_sort | Sanchez-Taltavull, Daniel |
collection | PubMed |
description | Assessing similarity is highly important for bioinformatics algorithms to determine correlations between biological information. A common problem is that similarity can appear by chance, particularly for low expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq) data because read counts are much lower compared to bulk RNA-seq. Recently, a Bayesian correlation scheme that assigns low similarity to genes that have low confidence expression estimates has been proposed to assess similarity for bulk RNA-seq. Our goal is to extend the properties of the Bayesian correlation in scRNA-seq data by considering three ways to compute similarity. First, we compute the similarity of pairs of genes over all cells. Second, we identify specific cell populations and compute the correlation in those populations. Third, we compute the similarity of pairs of genes over all clusters, by considering the total mRNA expression. We demonstrate that Bayesian correlations are more reproducible than Pearson correlations. Compared to Pearson correlations, Bayesian correlations have a smaller dependence on the number of input cells. We show that the Bayesian correlation algorithm assigns high similarity values to genes with a biological relevance in a specific population. We conclude that Bayesian correlation is a robust similarity measure in scRNA-seq data. |
format | Online Article Text |
id | pubmed-7671344 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7671344 2021-02-10 Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data Sanchez-Taltavull, Daniel Perkins, Theodore J Dommann, Noelle Melin, Nicolas Keogh, Adrian Candinas, Daniel Stroka, Deborah Beldi, Guido NAR Genom Bioinform Methods Article Oxford University Press 2020-01-24 /pmc/articles/PMC7671344/ /pubmed/33575552 http://dx.doi.org/10.1093/nargab/lqaa002 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671344/ https://www.ncbi.nlm.nih.gov/pubmed/33575552 http://dx.doi.org/10.1093/nargab/lqaa002 |