ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations
To exploit the plethora of information provided by Next Generation Sequencing, the identification of the genetic mutations responsible for disease in general or cancer in particular, among the thousands of neutral germline or somatic variations, is a crucial task. Genome-wide association studies for...
Main authors: | Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933770/ https://www.ncbi.nlm.nih.gov/pubmed/29723276 http://dx.doi.org/10.1371/journal.pone.0196849 |
_version_ | 1783320006906347520 |
---|---|
author | Zhou, Hongyi Gao, Mu Skolnick, Jeffrey |
author_facet | Zhou, Hongyi Gao, Mu Skolnick, Jeffrey |
author_sort | Zhou, Hongyi |
collection | PubMed |
description | To exploit the plethora of information provided by Next Generation Sequencing, the identification of the genetic mutations responsible for disease in general or cancer in particular, among the thousands of neutral germline or somatic variations, is a crucial task. Genome-wide association studies for the detection of disease-associated genes or cancer drivers can only identify common variations or driver genes in a cohort of patients. Thus, they cannot discover unique disease-associated mutations or cancer driver genes on a personal basis. Moreover, even when there are such common variations, their significance is unknown. Here, we extend the machine learning based approach ENTPRISE, developed for predicting the disease association of missense mutations, to frameshift and nonsense mutations. The new approach, ENTPRISE-X, is shown to outperform the state-of-the-art methods VEST-indel and DDIG-in for predicting the disease association of germline frameshift mutations in terms of a balanced measure, the Matthews correlation coefficient (MCC): ENTPRISE-X achieves an MCC of 0.586, versus 0.412 for VEST-indel and 0.321 for DDIG-in. Large scale testing on the ExAC dataset shows that ENTPRISE-X classifies a much lower fraction of variations as disease-causing (16%) than VEST-indel (26%) or DDIG-in (65%). A web server for ENTPRISE-X is freely available for academic users at http://cssb2.biology.gatech.edu/entprise-x. |
format | Online Article Text |
id | pubmed-5933770 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-59337702018-05-18 ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations Zhou, Hongyi Gao, Mu Skolnick, Jeffrey PLoS One Research Article To exploit the plethora of information provided by Next Generation Sequencing, the identification of the genetic mutations responsible for disease in general or cancer in particular, among the thousands of neutral germline or somatic variations, is a crucial task. Genome-wide association studies for the detection of disease-associated genes or cancer drivers can only identify common variations or driver genes in a cohort of patients. Thus, they cannot discover unique disease-associated mutations or cancer driver genes on a personal basis. Moreover, even when there are such common variations, their significance is unknown. Here, we extend the machine learning based approach ENTPRISE, developed for predicting the disease association of missense mutations, to frameshift and nonsense mutations. The new approach, ENTPRISE-X, is shown to outperform the state-of-the-art methods VEST-indel and DDIG-in for predicting the disease association of germline frameshift mutations in terms of a balanced measure, the Matthews correlation coefficient (MCC): ENTPRISE-X achieves an MCC of 0.586, versus 0.412 for VEST-indel and 0.321 for DDIG-in. Large scale testing on the ExAC dataset shows that ENTPRISE-X classifies a much lower fraction of variations as disease-causing (16%) than VEST-indel (26%) or DDIG-in (65%). A web server for ENTPRISE-X is freely available for academic users at http://cssb2.biology.gatech.edu/entprise-x. 
Public Library of Science 2018-05-03 /pmc/articles/PMC5933770/ /pubmed/29723276 http://dx.doi.org/10.1371/journal.pone.0196849 Text en © 2018 Zhou et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Zhou, Hongyi Gao, Mu Skolnick, Jeffrey ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title | ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title_full | ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title_fullStr | ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title_full_unstemmed | ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title_short | ENTPRISE-X: Predicting disease-associated frameshift and nonsense mutations |
title_sort | entprise-x: predicting disease-associated frameshift and nonsense mutations |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933770/ https://www.ncbi.nlm.nih.gov/pubmed/29723276 http://dx.doi.org/10.1371/journal.pone.0196849 |
work_keys_str_mv | AT zhouhongyi entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations AT gaomu entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations AT skolnickjeffrey entprisexpredictingdiseaseassociatedframeshiftandnonsensemutations |
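
The description field in this record compares predictors by the Matthews correlation coefficient (MCC). As a reading aid, the following is a minimal sketch of how an MCC is computed from a binary confusion matrix using the standard definition. The function name `matthews_corrcoef_from_counts` and the counts in the usage example are hypothetical illustrations; they are not taken from the article and do not reproduce its evaluation.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the Matthews correlation coefficient for a binary classifier.

    tp/tn/fp/fn are the counts of true positives, true negatives,
    false positives and false negatives. The result lies in [-1, 1].
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: MCC is defined as 0 when any marginal sum is zero.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the function.
    score = matthews_corrcoef_from_counts(tp=40, tn=45, fp=5, fn=10)
    print(f"MCC = {score:.3f}")
```

Because the MCC uses all four cells of the confusion matrix, it stays informative when disease-associated and neutral variants are unevenly represented, which is presumably why the description calls it a balanced measure.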