
SVFX: a machine learning framework to quantify the pathogenicity of structural variants

There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we g...


Bibliographic Details
Main Authors: Kumar, Sushant; Harmanci, Arif; Vytheeswaran, Jagath; Gerstein, Mark B.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7650198/
https://www.ncbi.nlm.nih.gov/pubmed/33168059
http://dx.doi.org/10.1186/s13059-020-02178-x
author Kumar, Sushant
Harmanci, Arif
Vytheeswaran, Jagath
Gerstein, Mark B.
collection PubMed
description There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways. SUPPLEMENTARY INFORMATION: Supplementary information accompanies this paper at 10.1186/s13059-020-02178-x.
format Online
Article
Text
id pubmed-7650198
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7650198 2020-11-09
SVFX: a machine learning framework to quantify the pathogenicity of structural variants
Kumar, Sushant; Harmanci, Arif; Vytheeswaran, Jagath; Gerstein, Mark B.
Genome Biol
Method
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways. SUPPLEMENTARY INFORMATION: Supplementary information accompanies this paper at 10.1186/s13059-020-02178-x.
BioMed Central 2020-11-09
/pmc/articles/PMC7650198/ /pubmed/33168059 http://dx.doi.org/10.1186/s13059-020-02178-x
Text en
© The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title SVFX: a machine learning framework to quantify the pathogenicity of structural variants
topic Method