Evaluation of an ensemble of genetic models for prediction of a quantitative trait
Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to bu...
Main authors: | Milton, Jacqueline N., Steinberg, Martin H., Sebastiani, Paola |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2015 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4292739/ https://www.ncbi.nlm.nih.gov/pubmed/25628649 http://dx.doi.org/10.3389/fgene.2014.00474 |
_version_ | 1782352538393116672 |
---|---|
author | Milton, Jacqueline N. Steinberg, Martin H. Sebastiani, Paola |
author_sort | Milton, Jacqueline N. |
collection | PubMed |
description | Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble. |
format | Online Article Text |
id | pubmed-4292739 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4292739 2015-01-27 Evaluation of an ensemble of genetic models for prediction of a quantitative trait Milton, Jacqueline N. Steinberg, Martin H. Sebastiani, Paola Front Genet Genetics Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble. Frontiers Media S.A. 2015-01-13 /pmc/articles/PMC4292739/ /pubmed/25628649 http://dx.doi.org/10.3389/fgene.2014.00474 Text en Copyright © 2015 Milton, Steinberg and Sebastiani. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Milton, Jacqueline N. Steinberg, Martin H. Sebastiani, Paola Evaluation of an ensemble of genetic models for prediction of a quantitative trait |
title | Evaluation of an ensemble of genetic models for prediction of a quantitative trait |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4292739/ https://www.ncbi.nlm.nih.gov/pubmed/25628649 http://dx.doi.org/10.3389/fgene.2014.00474 |
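
The description field above mentions summarizing risk alleles into a genetic score, using that score as a covariate, and combining several such models into an ensemble. The record does not say how the authors build or combine their models, so the following is only a minimal sketch under stated assumptions: markers are ranked by marginal association, each member model uses the top `k` markers for a different `k`, and the ensemble prediction is the plain average of the member predictions. All function names, the simulated data, and the choice of `ks` are hypothetical.

```python
import numpy as np

def marginal_effects(G, y):
    """Per-marker simple-regression slope and association strength with y."""
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    ss = (Gc ** 2).sum(axis=0)
    ss[ss == 0] = np.inf                  # monomorphic markers get slope 0
    cov = Gc.T @ yc
    beta = cov / ss                       # marginal effect estimate per marker
    strength = np.abs(cov) / np.sqrt(ss)  # proportional to |correlation|
    return beta, strength

def fit_score_model(G, y, idx, beta):
    """Regress the trait on a weighted genetic score built from markers idx."""
    score = G[:, idx] @ beta[idx]
    X = np.column_stack([np.ones(len(y)), score])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                           # (intercept, slope on the score)

def ensemble_predict(G_train, y_train, G_test, ks):
    """Average the predictions of score models that use the top-k markers.

    Assumption: equal-weight averaging over models; the article may differ.
    """
    beta, strength = marginal_effects(G_train, y_train)
    order = np.argsort(-strength)         # strongest association first
    preds = []
    for k in ks:
        idx = order[:k]
        b0, b1 = fit_score_model(G_train, y_train, idx, beta)
        preds.append(b0 + b1 * (G_test[:, idx] @ beta[idx]))
    return np.mean(preds, axis=0)

# Simulated data (hypothetical settings): 10 causal markers, heritability 0.5.
rng = np.random.default_rng(0)
n, m, n_causal, h2 = 600, 200, 10, 0.5
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)  # allele counts 0/1/2
true_beta = np.zeros(m)
true_beta[:n_causal] = 0.5
g = G @ true_beta
y = g + rng.normal(0.0, np.sqrt(g.var() * (1 - h2) / h2), n)

train, test = slice(0, 400), slice(400, None)
pred = ensemble_predict(G[train], y[train], G[test], ks=[5, 10, 20, 40])
r = np.corrcoef(pred, y[test])[0, 1]
print(f"correlation between ensemble prediction and held-out trait: {r:.2f}")
```

Averaging over several values of `k` avoids committing to a single marker count, which is the difficulty the description highlights; the trade-off it reports (more true markers included, at some cost in specificity) would have to be checked against the article itself.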