
PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants

Structural variants (SVs) represent a major source of genetic variation associated with phenotypic diversity and disease susceptibility. While long-read sequencing can discover over 20,000 SVs per human genome, interpreting their functional consequences remains challenging. Existing methods for iden...


Bibliographic Details
Main authors: Xu, Zhuoran; Li, Quan; Marchionni, Luigi; Wang, Kai
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10684511/
https://www.ncbi.nlm.nih.gov/pubmed/38016949
http://dx.doi.org/10.1038/s41467-023-43651-y
collection PubMed
description Structural variants (SVs) represent a major source of genetic variation associated with phenotypic diversity and disease susceptibility. While long-read sequencing can discover over 20,000 SVs per human genome, interpreting their functional consequences remains challenging. Existing methods for identifying disease-related SVs focus on deletion/duplication only and cannot prioritize individual genes affected by SVs, especially for noncoding SVs. Here, we introduce PhenoSV, a phenotype-aware machine-learning model that interprets all major types of SVs and genes affected. PhenoSV segments and annotates SVs with diverse genomic features and employs a transformer-based architecture to predict their impacts under a multiple-instance learning framework. With phenotype information, PhenoSV further utilizes gene-phenotype associations to prioritize phenotype-related SVs. Evaluation on extensive human SV datasets covering all SV types demonstrates PhenoSV’s superior performance over competing methods. Applications in diseases suggest that PhenoSV can determine disease-related genes from SVs. A web server and a command-line tool for PhenoSV are available at https://phenosv.wglab.org.
format Online
Article
Text
id pubmed-10684511
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10684511 2023-11-30 PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants Xu, Zhuoran; Li, Quan; Marchionni, Luigi; Wang, Kai Nat Commun Article Nature Publishing Group UK 2023-11-28 /pmc/articles/PMC10684511/ /pubmed/38016949 http://dx.doi.org/10.1038/s41467-023-43651-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants
topic Article