Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data

Assessing similarity is highly important for bioinformatics algorithms to determine correlations between biological information. A common problem is that similarity can appear by chance, particularly for low expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq) data beca...

Bibliographic Details
Main authors: Sanchez-Taltavull, Daniel; Perkins, Theodore J; Dommann, Noelle; Melin, Nicolas; Keogh, Adrian; Candinas, Daniel; Stroka, Deborah; Beldi, Guido
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Methods Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671344/
https://www.ncbi.nlm.nih.gov/pubmed/33575552
http://dx.doi.org/10.1093/nargab/lqaa002
author Sanchez-Taltavull, Daniel
Perkins, Theodore J
Dommann, Noelle
Melin, Nicolas
Keogh, Adrian
Candinas, Daniel
Stroka, Deborah
Beldi, Guido
author_sort Sanchez-Taltavull, Daniel
collection PubMed
description Assessing similarity is highly important for bioinformatics algorithms to determine correlations between biological information. A common problem is that similarity can appear by chance, particularly for low expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq) data because read counts are much lower compared to bulk RNA-seq. Recently, a Bayesian correlation scheme that assigns low similarity to genes that have low confidence expression estimates has been proposed to assess similarity for bulk RNA-seq. Our goal is to extend the properties of the Bayesian correlation in scRNA-seq data by considering three ways to compute similarity. First, we compute the similarity of pairs of genes over all cells. Second, we identify specific cell populations and compute the correlation in those populations. Third, we compute the similarity of pairs of genes over all clusters, by considering the total mRNA expression. We demonstrate that Bayesian correlations are more reproducible than Pearson correlations. Compared to Pearson correlations, Bayesian correlations have a smaller dependence on the number of input cells. We show that the Bayesian correlation algorithm assigns high similarity values to genes with a biological relevance in a specific population. We conclude that Bayesian correlation is a robust similarity measure in scRNA-seq data.
format Online
Article
Text
id pubmed-7671344
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7671344 2021-02-10 NAR Genom Bioinform Methods Article Oxford University Press 2020-01-24 /pmc/articles/PMC7671344/ /pubmed/33575552 http://dx.doi.org/10.1093/nargab/lqaa002 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data
topic Methods Article