Empirical evaluations of analytical issues arising from predicting HLA alleles using multiple SNPs

BACKGROUND: Numerous immune-mediated diseases have been associated with the class I and II HLA genes located within the major histocompatibility complex (MHC) consisting of highly polymorphic alleles encoded by the HLA-A, -B, -C, -DRB1, -DQB1 and -DPB1 loci. Genotyping for HLA alleles is complex and...

Bibliographic Details
Main Authors: Zhang, Xinyi Cindy; Li, Shuying Sue; Wang, Hongwei; Hansen, John A; Zhao, Lue Ping
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111398/
https://www.ncbi.nlm.nih.gov/pubmed/21518453
http://dx.doi.org/10.1186/1471-2156-12-39
author Zhang, Xinyi Cindy
Li, Shuying Sue
Wang, Hongwei
Hansen, John A
Zhao, Lue Ping
collection PubMed
description BACKGROUND: Numerous immune-mediated diseases have been associated with the class I and II HLA genes located within the major histocompatibility complex (MHC) consisting of highly polymorphic alleles encoded by the HLA-A, -B, -C, -DRB1, -DQB1 and -DPB1 loci. Genotyping for HLA alleles is complex and relatively expensive. Recent studies have demonstrated the feasibility of predicting HLA alleles, using MHC SNPs inside and outside of HLA that are typically included in SNP arrays and are commonly available in genome-wide association studies (GWAS). We have recently described a novel method that is complementary to the previous methods, for accurately predicting HLA alleles using unphased flanking SNP genotypes. In this manuscript, we address several practical issues relevant to the application of this methodology. RESULTS: Applying this new methodology to three large independent study cohorts, we have evaluated the performance of the predictive models in ethnically diverse populations. Specifically, we have found that utilizing imputed in addition to genotyped SNPs generally yields comparable if not better performance in prediction accuracies. Our evaluation also supports the idea that predictive models trained on one population are transferable to other populations of the same ethnicity. Further, when the training set includes multi-ethnic populations, the resulting models are reliable and perform well for the same subpopulations across all HLA genes. In contrast, the predictive models built from single ethnic populations have superior performance within the same ethnic population, but are not likely to perform well in other ethnic populations. CONCLUSIONS: The empirical explorations reported here provide further evidence in support of the application of this approach for predicting HLA alleles with GWAS-derived SNP data. Utilizing all available samples, we have built "state of the art" predictive models for HLA-A, -B, -C, -DRB1, -DQB1 and -DPB1. The HLA allele predictive models, along with the program used to carry out the prediction, are available on our website.
format Online
Article
Text
id pubmed-3111398
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3111398 2011-06-10 BMC Genet Research Article BioMed Central 2011-04-25 /pmc/articles/PMC3111398/ /pubmed/21518453 http://dx.doi.org/10.1186/1471-2156-12-39 Text en Copyright ©2011 Zhang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Empirical evaluations of analytical issues arising from predicting HLA alleles using multiple SNPs
topic Research Article