APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data
BACKGROUND: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTR. Often, 3′-UTR serves as a binding platform f...
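Main authors: | Fahmi, Naima Ahmed; Ahmed, Khandakar Tanvir; Chang, Jae-Woong; Nassereddeen, Heba; Fan, Deliang; Yong, Jeongsik; Zhang, Wei |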
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9520800/ https://www.ncbi.nlm.nih.gov/pubmed/36171568 http://dx.doi.org/10.1186/s12859-022-04939-w |
_version_ | 1784799707077279744 |
---|---|
author | Fahmi, Naima Ahmed Ahmed, Khandakar Tanvir Chang, Jae-Woong Nassereddeen, Heba Fan, Deliang Yong, Jeongsik Zhang, Wei |
author_facet | Fahmi, Naima Ahmed Ahmed, Khandakar Tanvir Chang, Jae-Woong Nassereddeen, Heba Fan, Deliang Yong, Jeongsik Zhang, Wei |
author_sort | Fahmi, Naima Ahmed |
collection | PubMed |
description | BACKGROUND: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. Often, the 3′-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations. METHODS: APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) plot user-specified events with 3′-UTR annotation and read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan. RESULTS: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared with the baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation. CONCLUSION: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates both RNA-seq and 3′-end-seq data and can efficiently identify significant events with high-resolution short-read coverage plots. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04939-w. |
format | Online Article Text |
id | pubmed-9520800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9520800 2022-09-30 APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data Fahmi, Naima Ahmed Ahmed, Khandakar Tanvir Chang, Jae-Woong Nassereddeen, Heba Fan, Deliang Yong, Jeongsik Zhang, Wei BMC Bioinformatics Research BACKGROUND: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. Often, the 3′-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events because of incomplete annotations and low analytical resolution: widely available pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA using only RNA-seq read coverage, which causes false-positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations. METHODS: APA-Scan uses either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) plot user-specified events with 3′-UTR annotation and read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan. RESULTS: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared with the baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation. CONCLUSION: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates both RNA-seq and 3′-end-seq data and can efficiently identify significant events with high-resolution short-read coverage plots. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04939-w. BioMed Central 2022-09-28 /pmc/articles/PMC9520800/ /pubmed/36171568 http://dx.doi.org/10.1186/s12859-022-04939-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Fahmi, Naima Ahmed Ahmed, Khandakar Tanvir Chang, Jae-Woong Nassereddeen, Heba Fan, Deliang Yong, Jeongsik Zhang, Wei APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title | APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title_full | APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title_fullStr | APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title_full_unstemmed | APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title_short | APA-Scan: detection and visualization of 3′-UTR alternative polyadenylation with RNA-seq and 3′-end-seq data |
title_sort | apa-scan: detection and visualization of 3′-utr alternative polyadenylation with rna-seq and 3′-end-seq data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9520800/ https://www.ncbi.nlm.nih.gov/pubmed/36171568 http://dx.doi.org/10.1186/s12859-022-04939-w |
work_keys_str_mv | AT fahminaimaahmed apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT ahmedkhandakartanvir apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT changjaewoong apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT nassereddeenheba apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT fandeliang apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT yongjeongsik apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata AT zhangwei apascandetectionandvisualizationof3utralternativepolyadenylationwithrnaseqand3endseqdata |
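
Illustrative sketch (not part of the catalogued record): the METHODS summary in the description field names two computational steps, read coverage over the 3′-UTR and a significance test between two conditions. The Python below is a minimal sketch of that idea under stated assumptions, not APA-Scan's actual code. The function names `region_means` and `chi2_2x2`, the 0-based half-open coordinates, the toy coverage values, and the choice of a 2x2 chi-squared test are all assumptions made for illustration; the abstract does not name the statistical test.

```python
import math
from typing import Sequence, Tuple


def region_means(
    coverage: Sequence[int], utr_start: int, pa_site: int, utr_end: int
) -> Tuple[float, float]:
    """Mean per-base coverage on each side of a candidate cleavage site.

    Hypothetical helper. Coordinates are 0-based, half-open indices into
    `coverage`, with the 3'-UTR running left to right (plus strand).
    Returns (mean upstream of the site, mean downstream of the site).
    The upstream part is shared by short and long isoforms; the
    downstream part is covered only by the long isoform.
    """
    if not (utr_start < pa_site < utr_end):
        raise ValueError("pa_site must lie strictly inside the 3'-UTR")
    shared = coverage[utr_start:pa_site]
    extended = coverage[pa_site:utr_end]
    return sum(shared) / len(shared), sum(extended) / len(extended)


def chi2_2x2(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) on [[a, b], [c, d]].

    Assumed stand-in for "evaluate the significance of the events".
    With 1 degree of freedom the p-value is erfc(sqrt(stat / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy per-base coverage for one 20-nt 3'-UTR with a candidate site at index 10.
cond1 = [100] * 10 + [80] * 10  # condition 1: long isoform still abundant
cond2 = [100] * 10 + [20] * 10  # condition 2: coverage drops after the site

up1, down1 = region_means(cond1, 0, 10, 20)
up2, down2 = region_means(cond2, 0, 10, 20)
stat, p_value = chi2_2x2(up1, down1, up2, down2)
print(f"chi2 = {stat:.2f}, p = {p_value:.2e}")
```

On this toy input the downstream coverage falls from 80 to 20 between conditions while the upstream coverage stays at 100, so the test reports a strong shift toward the short 3′-UTR in condition 2. A real analysis would read coverage from aligned RNA-seq data and correct for multiple testing across genes.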