
SweHLA: the high confidence HLA typing bio-resource drawn from 1000 Swedish genomes

There is a need to accurately call human leukocyte antigen (HLA) genes from existing short-read sequencing data; however, there is no single solution that matches the gold standard of Sanger sequenced lab typing. Here we aimed to combine results from available software programs, minimizing the biases...


Bibliographic Details
Main authors:	Nordin, Jessika, Ameur, Adam, Lindblad-Toh, Kerstin, Gyllensten, Ulf, Meadows, Jennifer R. S.
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2019
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7170882/ https://www.ncbi.nlm.nih.gov/pubmed/31844174 http://dx.doi.org/10.1038/s41431-019-0559-2

author	Nordin, Jessika Ameur, Adam Lindblad-Toh, Kerstin Gyllensten, Ulf Meadows, Jennifer R. S.
collection	PubMed
description	There is a need to accurately call human leukocyte antigen (HLA) genes from existing short-read sequencing data; however, there is no single solution that matches the gold standard of Sanger sequenced lab typing. Here we aimed to combine results from available software programs, minimizing the biases of applied algorithm and HLA reference. The result is a robust HLA population resource for the published 1000 Swedish genomes, and a framework for future HLA interrogation. HLA 2nd-field alleles were called using four imputation and inference methods for the classical eight genes (class I: HLA-A, HLA-B, HLA-C; class II: HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQB1, HLA-DRB1). A high confidence population set (SweHLA) was determined using an n−1 concordance rule for class I (four software) and class II (three software) alleles. Results were compared across populations and individual programs benchmarked to SweHLA. Per gene, 875 to 988 of the 1000 samples were genotyped in SweHLA; 920 samples had at least seven loci called. While a small fraction of reference alleles were common to all software (class I = 1.9% and class II = 4.1%), this did not affect the overall call rate. Gene-level concordance was high compared to European populations (>0.83%), with COX and PGF the dominant SweHLA haplotypes. We noted that 15/18 discordant alleles (delta allele frequency >2) were previously reported as disease-associated. These differences could in part explain across-study genetic replication failures, reinforcing the need to use multiple software solutions. SweHLA demonstrates a way to use existing NGS data to generate a population resource agnostic to individual HLA software biases.
format	Online Article Text
id	pubmed-7170882
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-7170882 2020-04-27 Eur J Hum Genet Article Springer International Publishing 2019-12-16 2020-05 /pmc/articles/PMC7170882/ /pubmed/31844174 http://dx.doi.org/10.1038/s41431-019-0559-2 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7170882/ https://www.ncbi.nlm.nih.gov/pubmed/31844174 http://dx.doi.org/10.1038/s41431-019-0559-2
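
The n−1 concordance rule mentioned in the description (class I called by four programs, class II by three) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `consensus_call`, the representation of a genotype as an unordered pair of 2nd-field alleles, the choice to compare whole genotypes rather than single alleles, and the treatment of a missing call as a disagreement are all assumptions made here for illustration.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical representation: one genotype = two 2nd-field alleles for one gene.
Genotype = Tuple[str, str]


def consensus_call(calls: Sequence[Optional[Genotype]]) -> Optional[Genotype]:
    """Return the genotype supported by at least n - 1 of the n programs.

    `calls` holds one entry per program; None means that program made no call
    (assumed here to count as a disagreement). Returns None if no genotype
    reaches the n - 1 threshold.
    """
    n = len(calls)
    if n < 3:
        # With fewer than three programs an n - 1 rule is ambiguous.
        raise ValueError("n - 1 concordance needs at least three programs")

    # Sort each pair so that allele order does not affect agreement.
    counts = Counter(tuple(sorted(c)) for c in calls if c is not None)
    if not counts:
        return None

    genotype, support = counts.most_common(1)[0]
    return genotype if support >= n - 1 else None


# Class I example: four programs, three agree (allele order differs in one).
class_i_calls = [
    ("A*02:01", "A*01:01"),
    ("A*01:01", "A*02:01"),
    ("A*01:01", "A*02:01"),
    ("A*01:01", "A*03:01"),
]

# Class II example: three programs, no two agree, so no call is made.
class_ii_calls = [
    ("DRB1*15:01", "DRB1*03:01"),
    ("DRB1*15:01", "DRB1*04:01"),
    None,
]

print(consensus_call(class_i_calls))
print(consensus_call(class_ii_calls))
```

Under these assumptions a gene with no genotype reaching the threshold is left uncalled for that sample, which is consistent with the per-gene sample counts in the description being below 1000.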


Similar Items