Accurate transcriptome-wide identification and quantification of alternative polyadenylation from RNA-seq data with APAIQ

Bibliographic Details
Main authors: Long, Yongkang, Zhang, Bin, Tian, Shuye, Chan, Jia Jia, Zhou, Juexiao, Li, Zhongxiao, Li, Yisheng, An, Zheng, Liao, Xingyu, Wang, Yu, Sun, Shiwei, Xu, Ying, Tay, Yvonne, Chen, Wei, Gao, Xin
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2023
Subjects: Methods
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234309/
https://www.ncbi.nlm.nih.gov/pubmed/37117035
http://dx.doi.org/10.1101/gr.277177.122
author Long, Yongkang
Zhang, Bin
Tian, Shuye
Chan, Jia Jia
Zhou, Juexiao
Li, Zhongxiao
Li, Yisheng
An, Zheng
Liao, Xingyu
Wang, Yu
Sun, Shiwei
Xu, Ying
Tay, Yvonne
Chen, Wei
Gao, Xin
collection PubMed
description Alternative polyadenylation (APA) enables a gene to generate multiple transcripts with different 3′ ends, a process that is dynamic across cell types and conditions. Many computational methods have been developed to characterize sample-specific APA from the corresponding RNA-seq data, but they suffer from high error rates in both polyadenylation site (PAS) identification and quantification of PAS usage (PAU), and from a bias toward 3′ untranslated regions. Here we developed a tool for APA identification and quantification (APAIQ) from RNA-seq data, which can accurately identify PAS and quantify PAU in a transcriptome-wide manner. Using 3′ end-seq data as the benchmark, we showed that APAIQ outperforms current methods, including DaPars2, Aptardi, mountainClimber, SANPolyA, and QAPA, on PAS identification and PAU quantification. Finally, applying APAIQ to 421 RNA-seq samples from liver cancer patients, we identified >540 tumor-associated APA events and experimentally validated two intronic polyadenylation candidates, demonstrating its capacity to unveil cancer-related APA in a large-scale RNA-seq data set.
format Online
Article
Text
id pubmed-10234309
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-10234309 2023-06-02 Genome Res Methods Cold Spring Harbor Laboratory Press 2023-04 /pmc/articles/PMC10234309/ /pubmed/37117035 http://dx.doi.org/10.1101/gr.277177.122 Text en © 2023 Long et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/).
title Accurate transcriptome-wide identification and quantification of alternative polyadenylation from RNA-seq data with APAIQ
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234309/
https://www.ncbi.nlm.nih.gov/pubmed/37117035
http://dx.doi.org/10.1101/gr.277177.122