Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms
Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could either be used as a “black box” in order to promote changes in life-st...
Main authors: | Manor, Ohad; Segal, Eran |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3749941/ https://www.ncbi.nlm.nih.gov/pubmed/23990773 http://dx.doi.org/10.1371/journal.pcbi.1003200 |
_version_ | 1782477046686941184 |
---|---|
author | Manor, Ohad Segal, Eran |
author_facet | Manor, Ohad Segal, Eran |
author_sort | Manor, Ohad |
collection | PubMed |
description | Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could be used either as a “black box” to promote lifestyle changes and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment. |
format | Online Article Text |
id | pubmed-3749941 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-37499412013-08-29 Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms Manor, Ohad Segal, Eran PLoS Comput Biol Research Article Genome-wide association studies (GWAS) are widely used to search for genetic loci that underlie human disease. Another goal is to predict disease risk for different individuals given their genetic sequence. Such predictions could be used either as a “black box” to promote lifestyle changes and screening for early diagnosis, or as a model that can be studied to better understand the mechanism of the disease. Current methods for risk prediction typically rank single nucleotide polymorphisms (SNPs) by the p-value of their association with the disease, and use the top-associated SNPs as input to a classification algorithm. However, the predictive power of such methods is relatively poor. To improve the predictive power, we devised BootRank, which uses bootstrapping to obtain a robust prioritization of SNPs for use in predictive models. We show that BootRank improves the ability to predict disease risk of unseen individuals in the Wellcome Trust Case Control Consortium (WTCCC) data and results in a more robust set of SNPs and a larger number of enriched pathways being associated with the different diseases. Finally, we show that combining BootRank with seven different classification algorithms improves performance compared to previous studies that used the WTCCC data. Notably, diseases for which BootRank results in the largest improvements were recently shown to have more heritability than previously thought, likely due to contributions from variants with low minor allele frequency (MAF), suggesting that BootRank can be beneficial in cases where SNPs affecting the disease are poorly tagged or have low MAF. Overall, our results show that improving disease risk prediction from genotypic information may be a tangible goal, with potential implications for personalized disease screening and treatment. Public Library of Science 2013-08-22 /pmc/articles/PMC3749941/ /pubmed/23990773 http://dx.doi.org/10.1371/journal.pcbi.1003200 Text en © 2013 Manor, Segal http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Manor, Ohad Segal, Eran Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title | Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title_full | Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title_fullStr | Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title_full_unstemmed | Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title_short | Predicting Disease Risk Using Bootstrap Ranking and Classification Algorithms |
title_sort | predicting disease risk using bootstrap ranking and classification algorithms |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3749941/ https://www.ncbi.nlm.nih.gov/pubmed/23990773 http://dx.doi.org/10.1371/journal.pcbi.1003200 |
work_keys_str_mv | AT manorohad predictingdiseaseriskusingbootstraprankingandclassificationalgorithms AT segaleran predictingdiseaseriskusingbootstraprankingandclassificationalgorithms |
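
Illustrative note: the abstract describes BootRank only as using bootstrapping to obtain a robust prioritization of SNPs. This record does not give the algorithm's details, so the following is a generic sketch of bootstrap rank aggregation, not the authors' implementation. The function names, the association score (a difference in mean allele count, standing in for an association p-value), and the mean-rank aggregation are all assumptions made for illustration.

```python
import random
from statistics import mean


def association_score(genotypes, labels, snp):
    """Hypothetical stand-in for an association test: absolute difference
    in mean allele count between cases (label 1) and controls (label 0)."""
    cases = [g[snp] for g, y in zip(genotypes, labels) if y == 1]
    controls = [g[snp] for g, y in zip(genotypes, labels) if y == 0]
    if not cases or not controls:
        return 0.0  # resample happened to contain a single class
    return abs(mean(cases) - mean(controls))


def bootstrap_rank(genotypes, labels, n_boot=200, seed=0):
    """Return SNP indices ordered by their mean rank across bootstrap
    resamples of the individuals (assumed aggregation rule)."""
    rng = random.Random(seed)
    n, n_snps = len(genotypes), len(genotypes[0])
    rank_sums = [0.0] * n_snps
    for _ in range(n_boot):
        # Resample individuals with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [genotypes[i] for i in idx]
        y = [labels[i] for i in idx]
        scores = [association_score(g, y, s) for s in range(n_snps)]
        # Rank 0 is the most strongly associated SNP in this resample.
        order = sorted(range(n_snps), key=lambda s: -scores[s])
        for rank, s in enumerate(order):
            rank_sums[s] += rank
    mean_ranks = [r / n_boot for r in rank_sums]
    return sorted(range(n_snps), key=lambda s: mean_ranks[s])


# Toy data: SNP 0 tracks case status, SNP 1 is constant, SNP 2 is noise.
toy_labels = [1] * 10 + [0] * 10
toy_genotypes = [[2 if y else 0, 1, i % 2] for i, y in enumerate(toy_labels)]
top_snps = bootstrap_rank(toy_genotypes, toy_labels, n_boot=50)
```

In a pipeline like the one the abstract outlines, the top entries of such a ranking would then be passed as features to a classification algorithm. Averaging ranks over resamples is one way to favor SNPs whose association is stable rather than driven by a few individuals.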