BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing

MOTIVATION: While many pipelines have been developed for calling genotypes using RNA-sequencing (RNA-Seq) data, they all have adapted DNA genotype callers that do not model biases specific to RNA-Seq such as allele-specific expression (ASE). RESULTS: Here, we present Bayesian beta-binomial mixture m...

Bibliographic Details
Main authors: Vigorito, Elena; Barton, Anne; Pitzalis, Costantino; Lewis, Myles J; Wallace, Chris
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318392/
https://www.ncbi.nlm.nih.gov/pubmed/37338536
http://dx.doi.org/10.1093/bioinformatics/btad393
_version_ 1785068027409072128
author Vigorito, Elena
Barton, Anne
Pitzalis, Costantino
Lewis, Myles J
Wallace, Chris
author_facet Vigorito, Elena
Barton, Anne
Pitzalis, Costantino
Lewis, Myles J
Wallace, Chris
author_sort Vigorito, Elena
collection PubMed
description MOTIVATION: While many pipelines have been developed for calling genotypes using RNA-sequencing (RNA-Seq) data, they all have adapted DNA genotype callers that do not model biases specific to RNA-Seq such as allele-specific expression (ASE). RESULTS: Here, we present Bayesian beta-binomial mixture model (BBmix), a Bayesian beta-binomial mixture model that first learns the expected distribution of read counts for each genotype, and then deploys those learned parameters to call genotypes probabilistically. We benchmarked our model on a wide variety of datasets and showed that our method generally performed better than competitors, mainly due to an increase of up to 1.4% in the accuracy of heterozygous calls, which may have a big impact in reducing false positive rate in applications sensitive to genotyping error such as ASE. Moreover, BBmix can be easily incorporated into standard pipelines for calling genotypes. We further show that parameters are generally transferable within datasets, such that a single learning run of less than 1 h is sufficient to call genotypes in a large number of samples. AVAILABILITY AND IMPLEMENTATION: We implemented BBmix as an R package that is available for free under a GPL-2 licence at https://gitlab.com/evigorito/bbmix and https://cran.r-project.org/package=bbmix with accompanying pipeline at https://gitlab.com/evigorito/bbmix_pipeline.
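The two-stage idea in the description (learn a read-count distribution per genotype, then call genotypes probabilistically) can be illustrated with a minimal Python sketch. This is not the BBmix implementation, which is an R package; the function names, the three beta-binomial parameter pairs, and the mixture weights below are hypothetical, hand-picked values standing in for parameters that a learning step would estimate from data.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_logpmf(k, n, a, b):
    """Log-probability of k alternative-allele reads out of n under BetaBinomial(n, a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


# Hypothetical parameters (assumptions for illustration, not values from the paper).
# Each genotype has its own beta-binomial over the alternative-allele read count.
COMPONENTS = {
    "hom_ref": (1.0, 99.0),   # alternative reads expected to be rare
    "het": (10.0, 10.0),      # centred on 50% with overdispersion
    "hom_alt": (99.0, 1.0),   # alternative reads expected to dominate
}
WEIGHTS = {"hom_ref": 0.80, "het": 0.15, "hom_alt": 0.05}


def genotype_posteriors(alt_reads, total_reads, components=COMPONENTS, weights=WEIGHTS):
    """Posterior probability of each genotype given the observed read counts."""
    log_joint = {
        g: log(weights[g]) + betabinom_logpmf(alt_reads, total_reads, a, b)
        for g, (a, b) in components.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint.values())
    unnorm = {g: exp(v - top) for g, v in log_joint.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}


def call_genotype(alt_reads, total_reads):
    """Return the most probable genotype and its posterior probability."""
    post = genotype_posteriors(alt_reads, total_reads)
    best = max(post, key=post.get)
    return best, post[best]
```

With these illustrative parameters, `call_genotype(10, 20)` selects the heterozygous component, while `call_genotype(0, 20)` selects the homozygous reference component. A heterozygous component with a wider or shifted distribution is one way such a model can tolerate allelic imbalance such as ASE.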
format Online
Article
Text
id pubmed-10318392
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10318392 2023-07-05 BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing Vigorito, Elena Barton, Anne Pitzalis, Costantino Lewis, Myles J Wallace, Chris Bioinformatics Original Paper MOTIVATION: While many pipelines have been developed for calling genotypes using RNA-sequencing (RNA-Seq) data, they all have adapted DNA genotype callers that do not model biases specific to RNA-Seq such as allele-specific expression (ASE). RESULTS: Here, we present Bayesian beta-binomial mixture model (BBmix), a Bayesian beta-binomial mixture model that first learns the expected distribution of read counts for each genotype, and then deploys those learned parameters to call genotypes probabilistically. We benchmarked our model on a wide variety of datasets and showed that our method generally performed better than competitors, mainly due to an increase of up to 1.4% in the accuracy of heterozygous calls, which may have a big impact in reducing false positive rate in applications sensitive to genotyping error such as ASE. Moreover, BBmix can be easily incorporated into standard pipelines for calling genotypes. We further show that parameters are generally transferable within datasets, such that a single learning run of less than 1 h is sufficient to call genotypes in a large number of samples. AVAILABILITY AND IMPLEMENTATION: We implemented BBmix as an R package that is available for free under a GPL-2 licence at https://gitlab.com/evigorito/bbmix and https://cran.r-project.org/package=bbmix with accompanying pipeline at https://gitlab.com/evigorito/bbmix_pipeline. Oxford University Press 2023-06-20 /pmc/articles/PMC10318392/ /pubmed/37338536 http://dx.doi.org/10.1093/bioinformatics/btad393 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Paper
Vigorito, Elena
Barton, Anne
Pitzalis, Costantino
Lewis, Myles J
Wallace, Chris
BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title_full BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title_fullStr BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title_full_unstemmed BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title_short BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing
title_sort bbmix: a bayesian beta-binomial mixture model for accurate genotyping from rna-sequencing
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10318392/
https://www.ncbi.nlm.nih.gov/pubmed/37338536
http://dx.doi.org/10.1093/bioinformatics/btad393
work_keys_str_mv AT vigoritoelena bbmixabayesianbetabinomialmixturemodelforaccurategenotypingfromrnasequencing
AT bartonanne bbmixabayesianbetabinomialmixturemodelforaccurategenotypingfromrnasequencing
AT pitzaliscostantino bbmixabayesianbetabinomialmixturemodelforaccurategenotypingfromrnasequencing
AT lewismylesj bbmixabayesianbetabinomialmixturemodelforaccurategenotypingfromrnasequencing
AT wallacechris bbmixabayesianbetabinomialmixturemodelforaccurategenotypingfromrnasequencing