HLA Haplotyping from RNA-seq Data Using Hierarchical Read Weighting

Correctly matching the HLA haplotypes of donor and recipient is essential to the success of allogenic hematopoietic stem cell transplantation. Current HLA typing methods rely on targeted testing of recognized antigens or sequences. Despite advances in Next Generation Sequencing, general high through...

Bibliographic Details
Main Authors: Kim, Hyunsung John; Pourmand, Nader
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3696101/
https://www.ncbi.nlm.nih.gov/pubmed/23840783
http://dx.doi.org/10.1371/journal.pone.0067885
Description: Correctly matching the HLA haplotypes of donor and recipient is essential to the success of allogenic hematopoietic stem cell transplantation. Current HLA typing methods rely on targeted testing of recognized antigens or sequences. Despite advances in next-generation sequencing, general high-throughput transcriptome sequencing is currently underutilized for HLA haplotyping because of the central difficulty of aligning sequences within this highly variable region. Here we present HLAforest, a method that can accurately predict HLA haplotypes by hierarchically weighting reads and applying an iterative, greedy, top-down pruning technique. HLAforest correctly predicts more than 99% of allele-group-level (2-digit) haplotypes and 93% of peptide-level (4-digit) haplotypes of the most diverse HLA genes in simulations with read lengths and error rates modeling currently available sequencing technology. The method is very robust to sequencing error and can predict 99% of allele-group-level haplotypes with substitution rates as high as 8.8%. When applied to data generated from a trio of cell lines, HLAforest corroborated PCR-based HLA haplotyping methods and accurately predicted 16/18 (89%) major class I genes for a daughter-father-mother trio at the peptide level. Major class II genes were predicted with 100% concordance between the daughter-father-mother trio. In fifty HapMap samples with paired-end reads just 37 nucleotides long, HLAforest predicted 96.5% of allele-group-level HLA haplotypes and 83% of peptide-level haplotypes correctly. In sixteen RNA-seq samples with limited coverage across HLA genes, HLAforest predicted 97.7% of allele-group-level haplotypes and 85% of peptide-level haplotypes correctly.
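The description names two ideas, hierarchical read weighting and greedy top-down pruning, without giving detail. The Python sketch below is one plausible toy reading of those two ideas for a single gene and a single call; it is not the HLAforest implementation. The input format (each read given as the list of allele names it aligns to), the equal-split weighting rule, and the function names `group_of`, `read_weights`, and `call_allele` are all assumptions made for illustration.

```python
from collections import defaultdict


def group_of(allele):
    """'A*02:01' -> 'A*02' (the 2-digit allele group)."""
    return allele.split(":")[0]


def read_weights(alleles):
    """Assumed weighting rule: split one read's unit weight equally among
    the allele groups it hits, then equally among alleles in each group."""
    by_group = defaultdict(list)
    for a in sorted(set(alleles)):
        by_group[group_of(a)].append(a)
    weights = {}
    for members in by_group.values():
        for a in members:
            weights[a] = 1.0 / len(by_group) / len(members)
    return weights


def call_allele(reads):
    """Greedy top-down call: pick the heaviest group, prune every
    alignment outside it, re-weight, then pick the heaviest allele."""
    group_score = defaultdict(float)
    for alleles in reads:
        for a, w in read_weights(alleles).items():
            group_score[group_of(a)] += w
    best_group = max(sorted(group_score), key=group_score.get)

    allele_score = defaultdict(float)
    for alleles in reads:
        kept = [a for a in alleles if group_of(a) == best_group]
        for a, w in read_weights(kept).items():
            allele_score[a] += w
    return max(sorted(allele_score), key=allele_score.get)


# Hypothetical alignments: four reads against HLA-A alleles.
reads = [
    ["A*02:01", "A*02:06"],
    ["A*02:01"],
    ["A*02:01", "A*01:01"],
    ["A*01:01"],
]
print(call_allele(reads))
```

In this toy input, the third read is ambiguous between two allele groups; once the heavier group is chosen, its alignment to the other group is pruned and its full weight goes to the surviving allele. A diploid call would need a second pass over the reads not explained by the first allele, which this sketch omits.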
Record ID: pubmed-3696101
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: PLoS One
Article Type: Research Article
Publication Date: 2013-06-28
License: https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.