Deciphering the impact of genetic variation on human polyadenylation using APARENT2
BACKGROUND: 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the co...
Main authors: | Linder, Johannes; Koplik, Samantha E.; Kundaje, Anshul; Seelig, Georg |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9636789/ https://www.ncbi.nlm.nih.gov/pubmed/36335397 http://dx.doi.org/10.1186/s13059-022-02799-4 |
_version_ | 1784825030611304448 |
---|---|
author | Linder, Johannes; Koplik, Samantha E.; Kundaje, Anshul; Seelig, Georg |
author_facet | Linder, Johannes; Koplik, Samantha E.; Kundaje, Anshul; Seelig, Georg |
author_sort | Linder, Johannes |
collection | PubMed |
description | BACKGROUND: 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging. RESULTS: We introduce a residual neural network model, APARENT2, that can infer 3′-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3′ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3′ untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of [Formula: see text] million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3′-end and autism spectrum disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells. 
CONCLUSIONS: A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3′-end mutations and human health. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02799-4. |
format | Online Article Text |
id | pubmed-9636789 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9636789 2022-11-06 Deciphering the impact of genetic variation on human polyadenylation using APARENT2 Linder, Johannes; Koplik, Samantha E.; Kundaje, Anshul; Seelig, Georg Genome Biol Research BACKGROUND: 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging. RESULTS: We introduce a residual neural network model, APARENT2, that can infer 3′-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3′ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3′ untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of [Formula: see text] million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3′-end and autism spectrum disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells.
CONCLUSIONS: A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3′-end mutations and human health. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-022-02799-4. BioMed Central 2022-11-05 /pmc/articles/PMC9636789/ /pubmed/36335397 http://dx.doi.org/10.1186/s13059-022-02799-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Linder, Johannes; Koplik, Samantha E.; Kundaje, Anshul; Seelig, Georg Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title | Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title_full | Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title_fullStr | Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title_full_unstemmed | Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title_short | Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
title_sort | deciphering the impact of genetic variation on human polyadenylation using aparent2 |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9636789/ https://www.ncbi.nlm.nih.gov/pubmed/36335397 http://dx.doi.org/10.1186/s13059-022-02799-4 |
work_keys_str_mv | AT linderjohannes decipheringtheimpactofgeneticvariationonhumanpolyadenylationusingaparent2 AT kopliksamanthae decipheringtheimpactofgeneticvariationonhumanpolyadenylationusingaparent2 AT kundajeanshul decipheringtheimpactofgeneticvariationonhumanpolyadenylationusingaparent2 AT seeliggeorg decipheringtheimpactofgeneticvariationonhumanpolyadenylationusingaparent2 |