Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties

Single amino acid variations (SAVs) potentially alter biological functions, including causing diseases or natural differences between individuals. Identifying the relationship between a SAV and a certain disease provides the starting point for understanding the underlying mechanisms of specific associ...

Bibliographic Details
Main Authors: Pan, Yuliang; Liu, Diwei; Deng, Lei
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5470696/
https://www.ncbi.nlm.nih.gov/pubmed/28614374
http://dx.doi.org/10.1371/journal.pone.0179314
_version_ 1783243810730409984
author Pan, Yuliang
Liu, Diwei
Deng, Lei
author_facet Pan, Yuliang
Liu, Diwei
Deng, Lei
author_sort Pan, Yuliang
collection PubMed
description Single amino acid variations (SAVs) potentially alter biological functions, including causing diseases or natural differences between individuals. Identifying the relationship between a SAV and a certain disease provides the starting point for understanding the underlying mechanisms of specific associations, and can help further the prevention and diagnosis of inherited disease. We propose PredSAV, a computational method that can effectively predict how likely SAVs are to be associated with disease by combining the gradient tree boosting (GTB) algorithm with optimally selected neighborhood features. A two-step feature selection approach is used to explore the most relevant and informative neighborhood properties that contribute to the prediction of disease association of SAVs across a wide range of sequence and structural features, especially some novel structural neighborhood features. In cross-validation experiments on the benchmark dataset, PredSAV achieves promising performance, with an AUC score of 0.908 and a specificity of 0.838, which are significantly better than those of other existing methods. Furthermore, we validate the capability of our proposed method with an independent test, in which it retains a competitive advantage. PredSAV, which combines gradient tree boosting with optimally selected neighborhood features, can return reliable predictions in distinguishing between disease-associated and neutral variants. Compared with existing methods, PredSAV shows improved specificity as well as increased overall performance.
format Online
Article
Text
id pubmed-5470696
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5470696 2017-07-03 Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties Pan, Yuliang Liu, Diwei Deng, Lei PLoS One Research Article Single amino acid variations (SAVs) potentially alter biological functions, including causing diseases or natural differences between individuals. Identifying the relationship between a SAV and a certain disease provides the starting point for understanding the underlying mechanisms of specific associations, and can help further the prevention and diagnosis of inherited disease. We propose PredSAV, a computational method that can effectively predict how likely SAVs are to be associated with disease by combining the gradient tree boosting (GTB) algorithm with optimally selected neighborhood features. A two-step feature selection approach is used to explore the most relevant and informative neighborhood properties that contribute to the prediction of disease association of SAVs across a wide range of sequence and structural features, especially some novel structural neighborhood features. In cross-validation experiments on the benchmark dataset, PredSAV achieves promising performance, with an AUC score of 0.908 and a specificity of 0.838, which are significantly better than those of other existing methods. Furthermore, we validate the capability of our proposed method with an independent test, in which it retains a competitive advantage. PredSAV, which combines gradient tree boosting with optimally selected neighborhood features, can return reliable predictions in distinguishing between disease-associated and neutral variants. Compared with existing methods, PredSAV shows improved specificity as well as increased overall performance.
Public Library of Science 2017-06-14 /pmc/articles/PMC5470696/ /pubmed/28614374 http://dx.doi.org/10.1371/journal.pone.0179314 Text en © 2017 Pan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Pan, Yuliang
Liu, Diwei
Deng, Lei
Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title_full Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title_fullStr Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title_full_unstemmed Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title_short Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
title_sort accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5470696/
https://www.ncbi.nlm.nih.gov/pubmed/28614374
http://dx.doi.org/10.1371/journal.pone.0179314
work_keys_str_mv AT panyuliang accuratepredictionoffunctionaleffectsforvariantsbycombininggradienttreeboostingwithoptimalneighborhoodproperties
AT liudiwei accuratepredictionoffunctionaleffectsforvariantsbycombininggradienttreeboostingwithoptimalneighborhoodproperties
AT denglei accuratepredictionoffunctionaleffectsforvariantsbycombininggradienttreeboostingwithoptimalneighborhoodproperties