Highly accurate quantification of allelic gene expression for population and disease genetics

Analysis of allele-specific gene expression (ASE) is a powerful approach for studying gene regulation, particularly when sample sizes are small, such as for rare diseases, or when studying the effects of rare genetic variation. However, detection of ASE events relies on accurate alignment of RNA seq...

Bibliographic Details
Main authors: Saukkonen, Anna; Kilpinen, Helena; Hodgkinson, Alan
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9435737/
https://www.ncbi.nlm.nih.gov/pubmed/35794008
http://dx.doi.org/10.1101/gr.276296.121
author Saukkonen, Anna
Kilpinen, Helena
Hodgkinson, Alan
collection PubMed
description Analysis of allele-specific gene expression (ASE) is a powerful approach for studying gene regulation, particularly when sample sizes are small, such as for rare diseases, or when studying the effects of rare genetic variation. However, detection of ASE events relies on accurate alignment of RNA sequencing reads, where challenges still remain, particularly for reads containing genetic variants or those that align to many different genomic locations. We have developed the Personalised ASE Caller (PAC), a tool that combines multiple steps to improve the quantification of allelic reads, including personalized (i.e., diploid) read alignment with improved allocation of multimapping reads. Using simulated RNA sequencing data, we show that PAC outperforms standard alignment approaches for ASE detection, reducing the number of sites with incorrect biases (>10%) by ∼80% and increasing the number of sites that can be reliably quantified by ∼3%. Applying PAC to real RNA sequencing data from 670 whole-blood samples, we show that genetic regulatory signatures inferred from ASE data more closely match those from population-based methods that are less prone to alignment biases. Finally, we use PAC to characterize cell type–specific ASE events that would be missed by standard alignment approaches, and in doing so identify disease-relevant genes that may modulate their effects through the regulation of gene expression. PAC can be applied to the vast quantity of existing RNA sequencing data sets to better understand a wide array of fundamental biological and disease processes.
format Online
Article
Text
id pubmed-9435737
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-9435737 2022-09-16 Genome Res Method
Cold Spring Harbor Laboratory Press 2022-08 /pmc/articles/PMC9435737/ /pubmed/35794008 http://dx.doi.org/10.1101/gr.276296.121 Text en © 2022 Saukkonen et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at https://creativecommons.org/licenses/by/4.0/.
title Highly accurate quantification of allelic gene expression for population and disease genetics
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9435737/
https://www.ncbi.nlm.nih.gov/pubmed/35794008
http://dx.doi.org/10.1101/gr.276296.121
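
Illustrative note: the description above refers to quantifying allelic reads and to sites with "incorrect biases (>10%)". The sketch below is not PAC and is not taken from the article; it only shows, under stated assumptions, how allelic imbalance at a single heterozygous site is commonly summarised from read counts. The function names are hypothetical, the 0.5 expected ratio is the usual null of balanced expression, and reading ">10%" as an absolute deviation of 0.10 between the observed and the true (simulated) reference-allele ratio is an assumption.

```python
from math import comb


def ref_ratio(ref_reads, alt_reads):
    """Fraction of allelic reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, expected=0.5):
    """Exact two-sided binomial p-value against an expected reference ratio.

    Sums the probability of every outcome that is no more likely than the
    observed one. Intended for modest read depths; very deep sites would
    need a log-space or library implementation.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * expected**k * (1 - expected) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are counted despite float rounding.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


def bias_exceeds(observed_ratio, true_ratio, threshold=0.10):
    """Assumed reading of '>10%': absolute ratio error above the threshold."""
    return abs(observed_ratio - true_ratio) > threshold


if __name__ == "__main__":
    ratio = ref_ratio(30, 10)
    print(ratio, binomial_two_sided_p(30, 10), bias_exceeds(ratio, 0.5))
```

In simulated data the true ratio is known, so a check like `bias_exceeds` can count sites where an alignment strategy distorts the estimate; in real data only the binomial test against 0.5 (or another expected ratio) is available.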