PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data
More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3’ untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3’UTR length changes from bulk RNA-...
Main authors: | Jonnakuti, Venkata Soumith; Wagner, Eric J.; Maletić-Savatić, Mirjana; Liu, Zhandong; Yalamanchili, Hari Krishna |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900750/ https://www.ncbi.nlm.nih.gov/pubmed/36747700 http://dx.doi.org/10.1101/2023.01.23.523471 |
_version_ | 1784882913077100544 |
---|---|
author | Jonnakuti, Venkata Soumith; Wagner, Eric J.; Maletić-Savatić, Mirjana; Liu, Zhandong; Yalamanchili, Hari Krishna |
author_sort | Jonnakuti, Venkata Soumith |
collection | PubMed |
description | More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3’ untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3’UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that was identified as being involved in scleroderma pathogenesis in an independent study. Lastly, we used PolyAMiner-Bulk to analyze the RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer’s Disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data. |
format | Online Article Text |
id | pubmed-9900750 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-9900750 2023-02-07 bioRxiv Article Cold Spring Harbor Laboratory 2023-01-24 /pmc/articles/PMC9900750/ /pubmed/36747700 http://dx.doi.org/10.1101/2023.01.23.523471 Text en This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data |
topic | Article |