Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation
BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3′ ends. Most APA occurs within 3′ UTRs, which harbor regulatory elements that can impact mRNA stability, translation,...
Main authors: | Shah, Ankeeta; Mittleman, Briana E.; Gilad, Yoav; Li, Yang I. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8518154/ https://www.ncbi.nlm.nih.gov/pubmed/34649612 http://dx.doi.org/10.1186/s13059-021-02502-z |
_version_ | 1784584164172890112 |
---|---|
author | Shah, Ankeeta Mittleman, Briana E. Gilad, Yoav Li, Yang I. |
author_facet | Shah, Ankeeta Mittleman, Briana E. Gilad, Yoav Li, Yang I. |
author_sort | Shah, Ankeeta |
collection | PubMed |
description | BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3′ ends. Most APA occurs within 3′ UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. RESULTS: APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3′-Seq, a specialized RNA-seq protocol that enriches for reads at the 3′ ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3′-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3′-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). CONCLUSIONS: We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3′-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02502-z. |
format | Online Article Text |
id | pubmed-8518154 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8518154 2021-10-20 Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation Shah, Ankeeta Mittleman, Briana E. Gilad, Yoav Li, Yang I. Genome Biol Research BACKGROUND: Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3′ ends. Most APA occurs within 3′ UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization. RESULTS: APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3′-Seq, a specialized RNA-seq protocol that enriches for reads at the 3′ ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3′-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3′-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL). CONCLUSIONS: We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3′-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-021-02502-z. BioMed Central 2021-10-14 /pmc/articles/PMC8518154/ /pubmed/34649612 http://dx.doi.org/10.1186/s13059-021-02502-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Shah, Ankeeta Mittleman, Briana E. Gilad, Yoav Li, Yang I. Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title | Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title_full | Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title_fullStr | Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title_full_unstemmed | Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title_short | Benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
title_sort | benchmarking sequencing methods and tools that facilitate the study of alternative polyadenylation |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8518154/ https://www.ncbi.nlm.nih.gov/pubmed/34649612 http://dx.doi.org/10.1186/s13059-021-02502-z |