Tools and best practices for data processing in allelic expression analysis

Allelic expression analysis has become important for integrating genome and transcriptome data to characterize various biological phenomena such as cis-regulatory variation and nonsense-mediated decay. We analyze the properties of allelic expression read count data and technical sources of error, su...

Bibliographic Details
Main authors: Castel, Stephane E.; Levy-Moonshine, Ami; Mohammadi, Pejman; Banks, Eric; Lappalainen, Tuuli
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4574606/
https://www.ncbi.nlm.nih.gov/pubmed/26381377
http://dx.doi.org/10.1186/s13059-015-0762-6
author Castel, Stephane E.
Levy-Moonshine, Ami
Mohammadi, Pejman
Banks, Eric
Lappalainen, Tuuli
collection PubMed
description Allelic expression analysis has become important for integrating genome and transcriptome data to characterize various biological phenomena such as cis-regulatory variation and nonsense-mediated decay. We analyze the properties of allelic expression read count data and technical sources of error, such as low-quality or double-counted RNA-seq reads, genotyping errors, allelic mapping bias, and technical covariates due to sample preparation and sequencing, and variation in total read depth. We provide guidelines for correcting such errors, show that our quality control measures improve the detection of relevant allelic expression, and introduce tools for the high-throughput production of allelic expression data from RNA-sequencing data. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-015-0762-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4574606
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4574606 2015-09-19 Genome Biol Method BioMed Central 2015-09-17 2015 /pmc/articles/PMC4574606/ /pubmed/26381377 http://dx.doi.org/10.1186/s13059-015-0762-6 Text en © Castel et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Tools and best practices for data processing in allelic expression analysis
topic Method