A fault-tolerant method for HLA typing with PacBio data

BACKGROUND: Human leukocyte antigen (HLA) genes are critical genes involved in important biomedical aspects, including organ transplantation, autoimmune diseases and infectious diseases. The gene family contains the most polymorphic genes in humans and the difference between two alleles is only a si...

Bibliographic Details
Main authors: Chang, Chia-Jung; Chen, Pei-Lung; Yang, Wei-Shiung; Chao, Kun-Mao
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4161847/
https://www.ncbi.nlm.nih.gov/pubmed/25183223
http://dx.doi.org/10.1186/1471-2105-15-296
author Chang, Chia-Jung
Chen, Pei-Lung
Yang, Wei-Shiung
Chao, Kun-Mao
collection PubMed
description BACKGROUND: Human leukocyte antigen (HLA) genes play critical roles in organ transplantation, autoimmune diseases and infectious diseases. The gene family contains the most polymorphic genes in humans, and in many cases two alleles differ by only a single base pair substitution. Next-generation sequencing (NGS) technologies can be used for high-throughput HLA typing, but in silico methods are still needed to correctly assign the alleles of a sample. Computer scientists have developed such methods for various NGS platforms, such as Illumina, Roche 454 and Ion Torrent, based on the characteristics of the reads they generate. However, methods for PacBio reads have received less attention, probably owing to the platform's high error rates. The PacBio system has the longest read length among available NGS platforms, and is therefore the only platform capable of capturing exon 2 and exon 3 of HLA genes on the same read, which unequivocally resolves the ambiguity caused by the “phasing” issue. RESULTS: We propose a new method, BayesTyping1, that uses Bayes’ theorem to assign HLA alleles from PacBio circular consensus sequencing reads. The method was applied to simulated data for the three loci HLA-A, HLA-B and HLA-DRB1. The experimental results showed that it tolerates the disturbance of sequencing errors and external noise reads. CONCLUSIONS: The BayesTyping1 method can, to some extent, overcome the problems of HLA typing with PacBio reads, which mostly arise from the sequencing errors of PacBio reads and the divergence of HLA genes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2105-15-296) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4161847
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4161847 2014-09-13 A fault-tolerant method for HLA typing with PacBio data Chang, Chia-Jung Chen, Pei-Lung Yang, Wei-Shiung Chao, Kun-Mao BMC Bioinformatics Methodology Article
BioMed Central 2014-09-03 /pmc/articles/PMC4161847/ /pubmed/25183223 http://dx.doi.org/10.1186/1471-2105-15-296 Text en © Chang et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A fault-tolerant method for HLA typing with PacBio data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4161847/
https://www.ncbi.nlm.nih.gov/pubmed/25183223
http://dx.doi.org/10.1186/1471-2105-15-296