
ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations

To exploit the plethora of information provided by Next Generation Sequencing, the identification of the genetic mutations responsible for disease in general or cancer in particular, among the thousands of neutral germline or somatic variations is a crucial task. Genome-wide association studies for...


Bibliographic Details
Main Authors: Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933770/
https://www.ncbi.nlm.nih.gov/pubmed/29723276
http://dx.doi.org/10.1371/journal.pone.0196849
_version_ 1783320006906347520
author Zhou, Hongyi
Gao, Mu
Skolnick, Jeffrey
author_facet Zhou, Hongyi
Gao, Mu
Skolnick, Jeffrey
author_sort Zhou, Hongyi
collection PubMed
description To exploit the plethora of information provided by Next Generation Sequencing, it is crucial to identify the genetic mutations responsible for disease in general, or cancer in particular, among the thousands of neutral germline or somatic variations. Genome-wide association studies for the detection of disease-associated genes or cancer drivers can only identify common variations or driver genes in a cohort of patients. Thus, they cannot discover unique disease-associated mutations or cancer driver genes on a personal basis. Moreover, even when there are such common variations, their significance is unknown. Here, we extend ENTPRISE, the machine-learning-based approach developed for predicting the disease association of missense mutations, to frameshift and nonsense mutations. The new approach, ENTPRISE-X, is shown to outperform the state-of-the-art methods VEST-indel and DDIG-in for predicting the disease association of germline frameshift mutations in terms of the Matthews correlation coefficient (MCC), a balanced measure: ENTPRISE-X achieves an MCC of 0.586, versus 0.412 for VEST-indel and 0.321 for DDIG-in. Large-scale testing on the ExAC dataset shows that ENTPRISE-X classifies a much lower fraction of variations as disease-causing, 16%, compared with 26% for VEST-indel and 65% for DDIG-in. A web server for ENTPRISE-X is freely available for academic users at http://cssb2.biology.gatech.edu/entprise-x.
format Online
Article
Text
id pubmed-5933770
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-59337702018-05-18 ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations Zhou, Hongyi Gao, Mu Skolnick, Jeffrey PLoS One Research Article To exploit the plethora of information provided by Next Generation Sequencing, the identification of the genetic mutations responsible for disease in general or cancer in particular, among the thousands of neutral germline or somatic variations is a crucial task. Genome-wide association studies for the detection of disease-associated genes or cancer drivers can only identify common variations or driver genes in a cohort of patients. Thus, they cannot discover unique disease-associated mutations or cancer driver genes on a personal basis. Moreover, even when there are such common variations, their significance is unknown. Here, we extend the machine learning based approach ENTPRISE developed for predicting the disease association of missense mutations to frameshift and nonsense mutations. The new approach, ENTPRISE-X, is shown to outperform the state-of-the-art methods VEST-indel and DDIG-in for predicting the disease association of germline frameshift mutations in terms of balanced measure Matthew’s correlation coefficient, MCC, with a MCC of 0.586 for ENTPRISE-X, versus 0.412 by VEST-indel and 0.321 by DDIG-in, respectively. Large scale testing on the ExAC dataset shows ENTPRISE-X has a much lower fraction of 16% of variations classified as disease causing, as compared to VEST-indel’s 26% and DDIG-in’s 65% of predictions as being disease-associated. A web server for ENTPRISE-X is freely available for academic users at http://cssb2.biology.gatech.edu/entprise-x. 
Public Library of Science 2018-05-03 /pmc/articles/PMC5933770/ /pubmed/29723276 http://dx.doi.org/10.1371/journal.pone.0196849 Text en © 2018 Zhou et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Zhou, Hongyi
Gao, Mu
Skolnick, Jeffrey
ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title_full ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title_fullStr ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title_full_unstemmed ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title_short ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
title_sort entprise-x: predicting disease-associated frameshift and nonsense mutations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933770/
https://www.ncbi.nlm.nih.gov/pubmed/29723276
http://dx.doi.org/10.1371/journal.pone.0196849
work_keys_str_mv AT zhouhongyi entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations
AT gaomu entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations
AT skolnickjeffrey entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations
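
Note on the metric: the abstract in this record ranks the methods by the Matthews correlation coefficient (MCC). The following is a minimal sketch of the standard MCC formula for binary classification, not the authors' evaluation code. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. Returns 0.0 when the denominator is zero, a common
    convention for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only (not data from the paper).
example_mcc = matthews_corrcoef(tp=80, tn=70, fp=30, fn=20)
print(f"MCC = {example_mcc:.3f}")
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and it uses all four cells of the confusion matrix, which is why it is described as a balanced measure for datasets with unequal class sizes.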