
Insight into Neutral and Disease-Associated Human Genetic Variants through Interpretable Predictors

A variety of methods that predict human nonsynonymous single nucleotide polymorphisms (SNPs) to be neutral or disease-associated have been developed over the last decade. These methods are used for pinpointing disease-associated variants in the many variants obtained with next-generation sequencing...


Bibliographic Details
Main authors: van den Berg, Bastiaan A., Reinders, Marcel J. T., de Ridder, Dick, de Beer, Tjaart A. P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4380319/
https://www.ncbi.nlm.nih.gov/pubmed/25826299
http://dx.doi.org/10.1371/journal.pone.0120729
_version_ 1782364313061687296
author van den Berg, Bastiaan A.
Reinders, Marcel J. T.
de Ridder, Dick
de Beer, Tjaart A. P.
collection PubMed
description A variety of methods that predict human nonsynonymous single nucleotide polymorphisms (SNPs) to be neutral or disease-associated have been developed over the last decade. These methods are used for pinpointing disease-associated variants in the many variants obtained with next-generation sequencing technologies. The high performances of current sequence-based predictors indicate that sequence data contains valuable information about a variant being neutral or disease-associated. However, most predictors do not readily disclose this information, and so it remains unclear what sequence properties are most important. Here, we show how we can obtain insight into sequence characteristics of variants and their surroundings by interpreting predictors. We used an extensive range of features derived from the variant itself, its surrounding sequence, sequence conservation, and sequence annotation, and employed linear support vector machine classifiers to enable extracting feature importance from trained predictors. Our approach is useful for providing additional information about what features are most important for the predictions made. Furthermore, for large sets of known variants, it can provide insight into the mechanisms responsible for variants being disease-associated.
format Online
Article
Text
id pubmed-4380319
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4380319 2015-04-09 PLoS One Research Article Public Library of Science 2015-03-31 /pmc/articles/PMC4380319/ /pubmed/25826299 http://dx.doi.org/10.1371/journal.pone.0120729 Text en © 2015 van den Berg et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Insight into Neutral and Disease-Associated Human Genetic Variants through Interpretable Predictors
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4380319/
https://www.ncbi.nlm.nih.gov/pubmed/25826299
http://dx.doi.org/10.1371/journal.pone.0120729
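
The description in this record states that linear support vector machine classifiers were used so that feature importance can be extracted from trained predictors. The following is a minimal sketch of that general idea, not the authors' implementation: it trains a linear SVM by plain subgradient descent on a tiny invented data set and ranks features by the absolute value of their weights. The feature names, the toy values, the label convention (+1 for disease-associated, -1 for neutral) and the hyperparameters are all assumptions made for illustration.

```python
# Sketch only: a linear SVM trained with full-batch subgradient descent,
# followed by a feature ranking based on |weight|.
# Feature names and data are hypothetical and not taken from the article.

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Minimise (lam/2)*||w||^2 + mean hinge loss by subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(names, w):
    """Sort (name, weight) pairs by absolute weight, largest first."""
    return sorted(zip(names, w), key=lambda pair: abs(pair[1]), reverse=True)


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical feature names, chosen to echo the feature categories
# mentioned in the description (conservation, variant, surroundings).
feature_names = ["conservation_score", "hydrophobicity_change", "flanking_composition"]

# Invented toy data, already on comparable scales.
X = [
    [2.0, 0.5, 0.1],
    [1.5, -0.5, -0.1],
    [2.5, 0.0, 0.2],
    [1.8, 0.3, -0.2],
    [-2.0, 0.4, 0.1],
    [-1.5, -0.3, -0.1],
    [-2.5, 0.1, 0.2],
    [-1.8, -0.2, -0.2],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]  # assumed: +1 disease-associated, -1 neutral

w, b = train_linear_svm(X, y)
ranked = rank_features(feature_names, w)

for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

Because the classifier is linear, each weight states how strongly, and in which direction, a feature pushes the decision, which is what makes this kind of predictor interpretable. Weights are only comparable when the features are on similar scales, so real data would normally be standardized first.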