Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data
Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data,...
Main authors: | Jung, Sunwoo; Lee, Cue Hyunkyu; Sul, Jae Hoon; Han, Buhm |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10413136/ https://www.ncbi.nlm.nih.gov/pubmed/37576186 http://dx.doi.org/10.1016/j.xhgg.2023.100223 |
_version_ | 1785087070669111296 |
---|---|
author | Jung, Sunwoo; Lee, Cue Hyunkyu; Sul, Jae Hoon; Han, Buhm |
author_facet | Jung, Sunwoo; Lee, Cue Hyunkyu; Sul, Jae Hoon; Han, Buhm |
author_sort | Jung, Sunwoo |
collection | PubMed |
description | Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data, while the second category uses whole-blood expression data. Both data types can be easily collected from blood, avoiding invasive tissue biopsies. In this study, we attempted to build an optimal predictive model for imputing tissue-specific gene expression by combining the genotype and whole-blood expression data. We first evaluated the imputation performance of each standalone model (using genotype data [GEN model] and using whole-blood expression data [WBE model]) using their respective data types across 47 human tissues. The WBE model outperformed the GEN model in most tissues by a large gain. Then, we developed several combined models that leverage both types of predictors to further improve imputation performance. We tried various strategies, including utilizing a merged dataset of the two data types (MERGED models) and integrating the imputation outcomes of the two standalone models (inverse variance-weighted [IVW] models). We found that one of the MERGED models noticeably outperformed the standalone models. This model involved a fixed ratio between the two regularization penalty factors for the two predictor types so that the contribution of the whole-blood transcriptome is upweighted compared with the genotype. Our study suggests that one can improve the imputation of tissue-specific gene expression by combining the genotype and whole-blood expression, but the improvement can be largely dependent on the combination strategy chosen. |
format | Online Article Text |
id | pubmed-10413136 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10413136 2023-08-11 Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data Jung, Sunwoo Lee, Cue Hyunkyu Sul, Jae Hoon Han, Buhm HGG Adv Article Accurate imputation of tissue-specific gene expression can be a powerful tool for understanding the biological mechanisms underlying human complex traits. Existing imputation methods can be grouped into two categories according to the types of predictors used. The first category uses genotype data, while the second category uses whole-blood expression data. Both data types can be easily collected from blood, avoiding invasive tissue biopsies. In this study, we attempted to build an optimal predictive model for imputing tissue-specific gene expression by combining the genotype and whole-blood expression data. We first evaluated the imputation performance of each standalone model (using genotype data [GEN model] and using whole-blood expression data [WBE model]) using their respective data types across 47 human tissues. The WBE model outperformed the GEN model in most tissues by a large gain. Then, we developed several combined models that leverage both types of predictors to further improve imputation performance. We tried various strategies, including utilizing a merged dataset of the two data types (MERGED models) and integrating the imputation outcomes of the two standalone models (inverse variance-weighted [IVW] models). We found that one of the MERGED models noticeably outperformed the standalone models. This model involved a fixed ratio between the two regularization penalty factors for the two predictor types so that the contribution of the whole-blood transcriptome is upweighted compared with the genotype. Our study suggests that one can improve the imputation of tissue-specific gene expression by combining the genotype and whole-blood expression, but the improvement can be largely dependent on the combination strategy chosen. Elsevier 2023-07-11 /pmc/articles/PMC10413136/ /pubmed/37576186 http://dx.doi.org/10.1016/j.xhgg.2023.100223 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Jung, Sunwoo Lee, Cue Hyunkyu Sul, Jae Hoon Han, Buhm Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title | Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title_full | Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title_fullStr | Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title_full_unstemmed | Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title_short | Building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
title_sort | building an optimal predictive model for imputing tissue-specific gene expression by combining genotype and whole-blood transcriptome data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10413136/ https://www.ncbi.nlm.nih.gov/pubmed/37576186 http://dx.doi.org/10.1016/j.xhgg.2023.100223 |
work_keys_str_mv | AT jungsunwoo buildinganoptimalpredictivemodelforimputingtissuespecificgeneexpressionbycombininggenotypeandwholebloodtranscriptomedata AT leecuehyunkyu buildinganoptimalpredictivemodelforimputingtissuespecificgeneexpressionbycombininggenotypeandwholebloodtranscriptomedata AT suljaehoon buildinganoptimalpredictivemodelforimputingtissuespecificgeneexpressionbycombininggenotypeandwholebloodtranscriptomedata AT hanbuhm buildinganoptimalpredictivemodelforimputingtissuespecificgeneexpressionbycombininggenotypeandwholebloodtranscriptomedata |
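
The description field in this record names two combination strategies: a merged predictor matrix with a fixed ratio between the penalty factors of the two predictor types, and inverse variance-weighted (IVW) integration of the two standalone predictions. The Python sketch below illustrates both ideas in general terms only. It is not the authors' implementation: it assumes a plain ridge penalty solved in closed form (the record does not say which regularized regression was used), and the function names, the `ratio` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def merged_penalty_factors(n_genotype, n_expression, ratio):
    """Per-column penalty factors for a merged [genotype | expression] matrix.

    Hypothetical helper: ratio > 1 penalizes genotype columns more heavily,
    which upweights the whole-blood expression columns by comparison.
    """
    return np.concatenate([np.full(n_genotype, float(ratio)),
                           np.ones(n_expression)])


def weighted_ridge(X, y, penalty_factors, lam=1.0):
    """Ridge regression with one penalty factor per column (an assumption;
    the source does not specify the penalty type).

    Solves (X'X + lam * diag(penalty_factors)) beta = X'y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = X.T @ X + lam * np.diag(penalty_factors)
    return np.linalg.solve(A, X.T @ y)


def ivw_combine(pred_a, var_a, pred_b, var_b):
    """Inverse variance-weighted average of two predictions."""
    w_a = 1.0 / np.asarray(var_a, dtype=float)
    w_b = 1.0 / np.asarray(var_b, dtype=float)
    return (w_a * pred_a + w_b * pred_b) / (w_a + w_b)


if __name__ == "__main__":
    # Synthetic stand-ins for genotype and whole-blood expression predictors.
    rng = np.random.default_rng(0)
    genotype = rng.normal(size=(50, 4))
    blood_expr = rng.normal(size=(50, 3))
    tissue_expr = blood_expr @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=50)

    X = np.hstack([genotype, blood_expr])
    factors = merged_penalty_factors(4, 3, ratio=5.0)
    beta = weighted_ridge(X, tissue_expr, factors, lam=2.0)
    print("merged-model coefficients:", beta)
    print("IVW of two predictions:", ivw_combine(1.0, 1.0, 3.0, 1.0))
```

In this sketch, the single `ratio` argument plays the role of the fixed ratio between the two penalty factors, so only one overall strength `lam` would remain to be tuned.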