A Bayesian Mixed Regression Based Prediction of Quantitative Traits from Molecular Marker and Gene Expression Data

Both molecular marker and gene expression data were considered alone as well as jointly to serve as additive predictors for two pathogen-activity phenotypes in real recombinant inbred lines of soybean. For unobserved phenotype prediction, we used Bayesian hierarchical regression modeling, where th...

Bibliographic Details
Main authors:	Bhattacharjee, Madhuchhanda; Sillanpää, Mikko J.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2011
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3210128/ https://www.ncbi.nlm.nih.gov/pubmed/22087238 http://dx.doi.org/10.1371/journal.pone.0026959

_version_	1782215711001673728
author	Bhattacharjee, Madhuchhanda Sillanpää, Mikko J.
collection	PubMed
description	Both molecular marker and gene expression data were considered alone as well as jointly to serve as additive predictors for two pathogen-activity phenotypes in real recombinant inbred lines of soybean. For unobserved phenotype prediction, we used Bayesian hierarchical regression modeling, in which the number of possible predictors in the model was controlled by the different selection strategies tested. Our initial findings were submitted to DREAM5 (the 5th Dialogue on Reverse Engineering Assessment and Methods challenge) and were judged to be the best in sub-challenge B3, in which both functional genomic and genetic data were used to predict the phenotypes. In this work we improve upon that earlier submission by considering various predictor selection strategies, using cross-validation to measure the accuracy of in-data and out-data predictions. The results from the various model choices indicate that, for these data, using both data types (namely functional genomic and genetic) simultaneously improves out-data prediction accuracy. Adequate goodness-of-fit can be easily achieved with more complex models for both phenotypes, since the number of potential predictors is large and the sample size is not small. We also studied gene-set enrichment (for the continuous phenotype) in the biological process in question and chromosomal enrichment of the gene set. The methodological contribution of this paper lies in the exploration of variable selection techniques to alleviate the problem of over-fitting. Different strategies based on the nature of the covariates were explored, and all methods were implemented under the Bayesian hierarchical modeling framework with indicator-based covariate selection. All models based on a careful variable selection procedure were found to produce significant results in a permutation test.
format	Online Article Text
id	pubmed-3210128
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3210128 2011-11-15 A Bayesian Mixed Regression Based Prediction of Quantitative Traits from Molecular Marker and Gene Expression Data Bhattacharjee, Madhuchhanda Sillanpää, Mikko J. PLoS One Research Article Public Library of Science 2011-11-07 /pmc/articles/PMC3210128/ /pubmed/22087238 http://dx.doi.org/10.1371/journal.pone.0026959 Text en Bhattacharjee, Sillanpää. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	A Bayesian Mixed Regression Based Prediction of Quantitative Traits from Molecular Marker and Gene Expression Data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3210128/ https://www.ncbi.nlm.nih.gov/pubmed/22087238 http://dx.doi.org/10.1371/journal.pone.0026959
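
The description in this record mentions cross-validated out-data prediction accuracy and a permutation test of significance. The sketch below illustrates that general idea only. It is not the authors' Bayesian hierarchical model: ridge regression is swapped in as a simple stand-in predictor, the data are simulated, and every name and parameter (`ridge_fit_predict`, `cv_correlation`, `permutation_p_value`, `lam`, the fold and permutation counts) is a hypothetical choice made for illustration.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Closed-form ridge regression (a stand-in, NOT the paper's Bayesian model)."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mean(y)) for the coefficients.
    gram = Xc.T @ Xc + lam * np.eye(Xc.shape[1])
    beta = np.linalg.solve(gram, Xc.T @ (y_train - y_mean))
    return (X_test - x_mean) @ beta + y_mean


def cv_correlation(X, y, n_folds=5, lam=1.0):
    """Correlation between held-out (out-data) predictions and observed values."""
    n = len(y)
    all_idx = np.arange(n)
    preds = np.empty(n)
    for test_idx in np.array_split(all_idx, n_folds):
        train_idx = np.setdiff1d(all_idx, test_idx)
        preds[test_idx] = ridge_fit_predict(
            X[train_idx], y[train_idx], X[test_idx], lam
        )
    return np.corrcoef(preds, y)[0, 1]


def permutation_p_value(X, y, n_perm=200, seed=0):
    """Compare the observed CV accuracy with accuracies on shuffled phenotypes."""
    rng = np.random.default_rng(seed)
    observed = cv_correlation(X, y)
    null = np.array(
        [cv_correlation(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    # The +1 terms keep the estimated p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Simulated example (assumed data): 100 lines, 20 predictors, 3 with real effects.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 20))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + rng.normal(scale=0.5, size=100)

observed, p_value = permutation_p_value(X, y)
print(f"out-data correlation: {observed:.3f}, permutation p-value: {p_value:.4f}")
```

Shuffling the phenotype breaks any real link between predictors and response, so the shuffled runs approximate the accuracy expected by chance. A model whose cross-validated accuracy exceeds nearly all of those runs receives a small p-value.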
