PDIVAS: Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing

BACKGROUND: Deep-intronic variants that alter RNA splicing were ineffectively evaluated in the search for the cause of genetic diseases. Determination of such pathogenic variants from a vast number of deep-intronic variants (approximately 1,500,000 variants per individual) represents a technical cha...

Bibliographic Details
Main authors: Kurosawa, Ryo; Iida, Kei; Ajiro, Masahiko; Awaya, Tomonari; Yamada, Mamiko; Kosaki, Kenjiro; Hagiwara, Masatoshi
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10563346/
https://www.ncbi.nlm.nih.gov/pubmed/37817060
http://dx.doi.org/10.1186/s12864-023-09645-2
author Kurosawa, Ryo
Iida, Kei
Ajiro, Masahiko
Awaya, Tomonari
Yamada, Mamiko
Kosaki, Kenjiro
Hagiwara, Masatoshi
collection PubMed
description BACKGROUND: Deep-intronic variants that alter RNA splicing were ineffectively evaluated in the search for the cause of genetic diseases. Determination of such pathogenic variants from a vast number of deep-intronic variants (approximately 1,500,000 variants per individual) represents a technical challenge to researchers. Thus, we developed a Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing (PDIVAS) to easily detect pathogenic deep-intronic variants. RESULTS: PDIVAS was trained on an ensemble machine-learning algorithm to classify pathogenic and benign variants in a curated dataset. The dataset consists of manually curated pathogenic splice-altering variants (SAVs) and commonly observed benign variants within deep introns. Splicing features and a splicing constraint metric were used to maximize the predictive sensitivity and specificity, respectively. PDIVAS showed an average precision of 0.92 and a maximum MCC of 0.88 in classifying these variants, which were the best of the previous predictors. When PDIVAS was applied to genome sequencing analysis on a threshold with 95% sensitivity for reported pathogenic SAVs, an average of 27 pathogenic candidates were extracted per individual. Furthermore, the causative variants in simulated patient genomes were more efficiently prioritized than the previous predictors. CONCLUSION: Incorporating PDIVAS into variant interpretation pipelines will enable efficient detection of disease-causing deep-intronic SAVs and contribute to improving the diagnostic yield. PDIVAS is publicly available at https://github.com/shiro-kur/PDIVAS. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-023-09645-2.
format Online
Article
Text
id pubmed-10563346
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10563346 2023-10-11 BMC Genomics Research BioMed Central 2023-10-10 /pmc/articles/PMC10563346/ /pubmed/37817060 http://dx.doi.org/10.1186/s12864-023-09645-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title PDIVAS: Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10563346/
https://www.ncbi.nlm.nih.gov/pubmed/37817060
http://dx.doi.org/10.1186/s12864-023-09645-2