
Evaluation of an ensemble of genetic models for prediction of a quantitative trait

Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to bu...


Bibliographic Details
Main Authors: Milton, Jacqueline N., Steinberg, Martin H., Sebastiani, Paola
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4292739/
https://www.ncbi.nlm.nih.gov/pubmed/25628649
http://dx.doi.org/10.3389/fgene.2014.00474
author Milton, Jacqueline N.
Steinberg, Martin H.
Sebastiani, Paola
collection PubMed
description Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble.
format Online
Article
Text
id pubmed-4292739
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4292739 2015-01-27 Evaluation of an ensemble of genetic models for prediction of a quantitative trait Milton, Jacqueline N. Steinberg, Martin H. Sebastiani, Paola Front Genet Genetics Frontiers Media S.A. 2015-01-13 /pmc/articles/PMC4292739/ /pubmed/25628649 http://dx.doi.org/10.3389/fgene.2014.00474 Text en Copyright © 2015 Milton, Steinberg and Sebastiani.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Evaluation of an ensemble of genetic models for prediction of a quantitative trait
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4292739/
https://www.ncbi.nlm.nih.gov/pubmed/25628649
http://dx.doi.org/10.3389/fgene.2014.00474