Combining genetic constraint with predictions of alternative splicing to prioritize deleterious splicing in rare disease studies
BACKGROUND: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exon...
Main Authors: | Cormier, Michael J.; Pedersen, Brent S.; Bayrak-Toydemir, Pinar; Quinlan, Aaron R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9664736/ https://www.ncbi.nlm.nih.gov/pubmed/36376793 http://dx.doi.org/10.1186/s12859-022-05041-x |
_version_ | 1784831162433142784 |
---|---|
author | Cormier, Michael J. Pedersen, Brent S. Bayrak-Toydemir, Pinar Quinlan, Aaron R. |
author_facet | Cormier, Michael J. Pedersen, Brent S. Bayrak-Toydemir, Pinar Quinlan, Aaron R. |
author_sort | Cormier, Michael J. |
collection | PubMed |
description | BACKGROUND: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exonic loci outside of canonical donor or acceptor splice sites. RESULTS: Several existing tools predict the likelihood that a genetic variant causes alternative splicing. We sought to extend such methods by developing a new metric that aids in discerning whether a genetic variant leads to deleterious alternative splicing. Our metric combines genetic variation in the Genome Aggregate Database with alternative splicing predictions from SpliceAI to compare observed and expected levels of splice-altering genetic variation. We infer genic regions with significantly less splice-altering variation than expected to be constrained. The resulting model of regional splicing constraint captures differential splicing constraint across gene and exon categories, and the most constrained genic regions are enriched for pathogenic splice-altering variants. Building from this model, we developed ConSpliceML. This ensemble machine learning approach combines regional splicing constraint with multiple per-nucleotide alternative splicing scores to guide the prediction of deleterious splicing variants in protein-coding genes. ConSpliceML more accurately distinguishes deleterious and benign splicing variants than state-of-the-art splicing prediction methods, especially in “cryptic” splicing regions beyond canonical donor or acceptor splice sites. CONCLUSION: Integrating a model of genetic constraint with annotations from existing alternative splicing tools allows ConSpliceML to prioritize potentially deleterious splice-altering variants in studies of rare human diseases. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05041-x. |
format | Online Article Text |
id | pubmed-9664736 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-96647362022-11-15 Combining genetic constraint with predictions of alternative splicing to prioritize deleterious splicing in rare disease studies Cormier, Michael J. Pedersen, Brent S. Bayrak-Toydemir, Pinar Quinlan, Aaron R. BMC Bioinformatics Research BACKGROUND: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exonic loci outside of canonical donor or acceptor splice sites. RESULTS: Several existing tools predict the likelihood that a genetic variant causes alternative splicing. We sought to extend such methods by developing a new metric that aids in discerning whether a genetic variant leads to deleterious alternative splicing. Our metric combines genetic variation in the Genome Aggregate Database with alternative splicing predictions from SpliceAI to compare observed and expected levels of splice-altering genetic variation. We infer genic regions with significantly less splice-altering variation than expected to be constrained. The resulting model of regional splicing constraint captures differential splicing constraint across gene and exon categories, and the most constrained genic regions are enriched for pathogenic splice-altering variants. Building from this model, we developed ConSpliceML. This ensemble machine learning approach combines regional splicing constraint with multiple per-nucleotide alternative splicing scores to guide the prediction of deleterious splicing variants in protein-coding genes. ConSpliceML more accurately distinguishes deleterious and benign splicing variants than state-of-the-art splicing prediction methods, especially in “cryptic” splicing regions beyond canonical donor or acceptor splice sites. 
CONCLUSION: Integrating a model of genetic constraint with annotations from existing alternative splicing tools allows ConSpliceML to prioritize potentially deleterious splice-altering variants in studies of rare human diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05041-x. BioMed Central 2022-11-14 /pmc/articles/PMC9664736/ /pubmed/36376793 http://dx.doi.org/10.1186/s12859-022-05041-x Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Cormier, Michael J. Pedersen, Brent S. Bayrak-Toydemir, Pinar Quinlan, Aaron R. Combining genetic constraint with predictions of alternative splicing to prioritize deleterious splicing in rare disease studies |
title | Combining genetic constraint with predictions of alternative splicing to prioritize deleterious splicing in rare disease studies |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9664736/ https://www.ncbi.nlm.nih.gov/pubmed/36376793 http://dx.doi.org/10.1186/s12859-022-05041-x |
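
The abstract describes comparing observed and expected levels of splice-altering variation and treating regions with less variation than expected as constrained. The sketch below illustrates that general idea only; it is not the authors' implementation. The `Region` class, the function names, the example counts, and the percentile ranking are all hypothetical choices made for illustration, and the record does not say how the expected counts or the significance test are actually computed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:
    """A genic region with toy counts (hypothetical structure, not from the paper)."""
    name: str
    observed: float  # splice-altering variants seen in population data
    expected: float  # splice-altering variants expected under some background model


def oe_ratio(region: Region) -> float:
    """Observed/expected ratio; values well below 1 suggest depletion."""
    if region.expected <= 0:
        raise ValueError(f"expected count must be positive for {region.name}")
    return region.observed / region.expected


def constraint_percentiles(regions: List[Region]) -> Dict[str, float]:
    """Rank regions so that a lower O/E ratio maps to a higher percentile (0-100).

    This ranking is an assumption for illustration, not the published scoring scheme.
    """
    ratios = {r.name: oe_ratio(r) for r in regions}
    n = len(ratios)
    scores = {}
    for name, ratio in ratios.items():
        # Count regions whose O/E is at least as large as this one (itself included).
        at_least_as_large = sum(1 for other in ratios.values() if other >= ratio)
        scores[name] = 100.0 * at_least_as_large / n
    return scores


# Made-up example counts, purely to show the call pattern.
example_regions = [
    Region("region_A", observed=2, expected=10),
    Region("region_B", observed=9, expected=10),
    Region("region_C", observed=5, expected=10),
]
example_scores = constraint_percentiles(example_regions)
```

In this toy setup the region with the fewest observed variants relative to expectation receives the highest score. A real analysis would also need a mutation-rate-aware expectation and a significance test, neither of which is specified in this record.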