Accurate transcriptome-wide identification and quantification of alternative polyadenylation from RNA-seq data with APAIQ
Alternative polyadenylation (APA) enables a gene to generate multiple transcripts with different 3′ ends, which is dynamic across different cell types or conditions. Many computational methods have been developed to characterize sample-specific APA using the corresponding RNA-seq data, but suffered...
Main authors: | Long, Yongkang; Zhang, Bin; Tian, Shuye; Chan, Jia Jia; Zhou, Juexiao; Li, Zhongxiao; Li, Yisheng; An, Zheng; Liao, Xingyu; Wang, Yu; Sun, Shiwei; Xu, Ying; Tay, Yvonne; Chen, Wei; Gao, Xin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press, 2023 |
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234309/ https://www.ncbi.nlm.nih.gov/pubmed/37117035 http://dx.doi.org/10.1101/gr.277177.122 |
_version_ | 1785052461201883136 |
---|---|
author | Long, Yongkang; Zhang, Bin; Tian, Shuye; Chan, Jia Jia; Zhou, Juexiao; Li, Zhongxiao; Li, Yisheng; An, Zheng; Liao, Xingyu; Wang, Yu; Sun, Shiwei; Xu, Ying; Tay, Yvonne; Chen, Wei; Gao, Xin |
author_sort | Long, Yongkang |
collection | PubMed |
description | Alternative polyadenylation (APA) enables a gene to generate multiple transcripts with different 3′ ends, which is dynamic across different cell types or conditions. Many computational methods have been developed to characterize sample-specific APA using the corresponding RNA-seq data, but suffered from high error rate on both polyadenylation site (PAS) identification and quantification of PAS usage (PAU), and bias toward 3′ untranslated regions. Here we developed a tool for APA identification and quantification (APAIQ) from RNA-seq data, which can accurately identify PAS and quantify PAU in a transcriptome-wide manner. Using 3′ end-seq data as the benchmark, we showed that APAIQ outperforms current methods on PAS identification and PAU quantification, including DaPars2, Aptardi, mountainClimber, SANPolyA, and QAPA. Finally, applying APAIQ on 421 RNA-seq samples from liver cancer patients, we identified >540 tumor-associated APA events and experimentally validated two intronic polyadenylation candidates, demonstrating its capacity to unveil cancer-related APA with a large-scale RNA-seq data set. |
format | Online Article Text |
id | pubmed-10234309 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10234309 2023-06-02 Genome Res Methods Cold Spring Harbor Laboratory Press 2023-04 /pmc/articles/PMC10234309/ /pubmed/37117035 http://dx.doi.org/10.1101/gr.277177.122 Text en © 2023 Long et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at https://creativecommons.org/licenses/by-nc/4.0/. |
title | Accurate transcriptome-wide identification and quantification of alternative polyadenylation from RNA-seq data with APAIQ |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234309/ https://www.ncbi.nlm.nih.gov/pubmed/37117035 http://dx.doi.org/10.1101/gr.277177.122 |