Benchmarking splice variant prediction algorithms using massively parallel splicing assays

BACKGROUND: Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compou...

Bibliographic Details
Main Authors:	Smith, Cathy; Kitzman, Jacob O.
Format:	Online Article Text
Language:	English
Published:	Cold Spring Harbor Laboratory 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187268/ https://www.ncbi.nlm.nih.gov/pubmed/37205456 http://dx.doi.org/10.1101/2023.05.04.539398

author	Smith, Cathy Kitzman, Jacob O.
collection	PubMed
description	BACKGROUND: Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. RESULTS: We benchmarked eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compared experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms’ concordance with MPSA measurements, and with each other, was lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieved the best overall performance at distinguishing disruptive and neutral variants. Controlling for overall call rate genome-wide, SpliceAI and Pangolin also showed superior overall sensitivity for identifying SDVs. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation. We suggest strategies for optimal splice effect prediction in the face of these issues. CONCLUSION: SpliceAI and Pangolin showed the best overall performance among predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
format	Online Article Text
id	pubmed-10187268
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Cold Spring Harbor Laboratory
record_format	MEDLINE/PubMed
spelling	pubmed-10187268 2023-05-17 Benchmarking splice variant prediction algorithms using massively parallel splicing assays Smith, Cathy Kitzman, Jacob O. bioRxiv Article Cold Spring Harbor Laboratory 2023-05-07 /pmc/articles/PMC10187268/ /pubmed/37205456 http://dx.doi.org/10.1101/2023.05.04.539398 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title	Benchmarking splice variant prediction algorithms using massively parallel splicing assays
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10187268/ https://www.ncbi.nlm.nih.gov/pubmed/37205456 http://dx.doi.org/10.1101/2023.05.04.539398
