
A hidden Markov model for haplotype inference for present-absent data of clustered genes using identified haplotypes and haplotype patterns

The majority of killer cell immunoglobulin-like receptor (KIR) genes are detected as either present or absent using locus-specific genotyping technology. Ambiguity arises from the presence of a specific KIR gene, since the exact copy number (one or two) of that gene is unknown. Therefore, haplotype inf...


Bibliographic Details
Main authors: Wu, Jihua; Chen, Guo-Bo; Zhi, Degui; Liu, Nianjun; Zhang, Kui
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4129397/
https://www.ncbi.nlm.nih.gov/pubmed/25161663
http://dx.doi.org/10.3389/fgene.2014.00267
author Wu, Jihua
Chen, Guo-Bo
Zhi, Degui
Liu, Nianjun
Zhang, Kui
collection PubMed
description The majority of killer cell immunoglobulin-like receptor (KIR) genes are detected as either present or absent using locus-specific genotyping technology. Ambiguity arises from the presence of a specific KIR gene, since the exact copy number (one or two) of that gene is unknown. Haplotype inference for these genes is therefore more challenging because of this large portion of missing information. Meanwhile, many haplotypes and partial haplotype patterns have been previously identified owing to tight linkage disequilibrium (LD) among these clustered genes, and they can be incorporated to facilitate haplotype inference. In this paper, we developed a hidden Markov model (HMM)-based method that can incorporate identified haplotypes or partial haplotype patterns for haplotype inference from present-absent data of clustered genes (e.g., KIR genes). Through extensive simulations for KIR genes, we compared its performance with a previously developed expectation maximization (EM)-based method in terms of haplotype assignments and haplotype frequency estimation. The simulation results showed that the new HMM-based method outperformed the previous method when some incorrect haplotypes were included as identified haplotypes and/or the standard deviation of haplotype frequencies was small. We also compared the performance of our method with two methods that do not use previously identified haplotypes and haplotype patterns: an EM-based method, HAPLORE, and an HMM-based method, MaCH. Our simulation results showed that the incorporation of identified haplotypes and partial haplotype patterns can improve the accuracy of haplotype inference. The new software package HaploHMM is available and can be downloaded at http://www.soph.uab.edu/ssg/files/People/KZhang/HaploHMM/haplohmm-index.html.
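The copy-number ambiguity described in the abstract can be illustrated with a minimal Python sketch. It is not the HaploHMM algorithm and does not reproduce any method from the paper: it only enumerates which pairs of previously identified haplotypes are consistent with one present/absent profile, then weights them under an assumed Hardy-Weinberg model. The three-gene haplotypes, their frequencies, and the function names `compatible_pairs` and `pair_posteriors` are all hypothetical and chosen for illustration.

```python
from itertools import combinations_with_replacement


def compatible_pairs(observed, haplotypes):
    """Return unordered haplotype pairs consistent with a present/absent profile.

    A gene is observed as present (1) if at least one of the two haplotypes
    carries it, and as absent (0) only if neither does.
    """
    pairs = []
    for h1, h2 in combinations_with_replacement(haplotypes, 2):
        if all(o == (a | b) for o, a, b in zip(observed, h1, h2)):
            pairs.append((h1, h2))
    return pairs


def pair_posteriors(observed, freqs):
    """Posterior probability of each compatible pair.

    Assumption: Hardy-Weinberg equilibrium, so a pair (h1, h2) has prior
    weight f1 * f2, doubled when the two haplotypes differ.
    """
    weights = {}
    for h1, h2 in compatible_pairs(observed, list(freqs)):
        w = freqs[h1] * freqs[h2]
        if h1 != h2:
            w *= 2
        weights[(h1, h2)] = w
    total = sum(weights.values())
    if total == 0:
        return {}
    return {pair: w / total for pair, w in weights.items()}


# Hypothetical haplotypes over three genes (1 = gene present on the haplotype).
freqs = {(1, 0, 1): 0.5, (1, 1, 0): 0.3, (1, 1, 1): 0.2}

# An individual in whom all three genes are detected as present.
observed = (1, 1, 1)
posteriors = pair_posteriors(observed, freqs)
for pair, p in sorted(posteriors.items(), key=lambda kv: -kv[1]):
    print(pair, round(p, 3))
```

In this toy setting, a profile with every gene present is compatible with four different haplotype pairs, whereas a profile with an absent gene is far more constrained. This is why a list of identified haplotypes narrows the search.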
format Online
Article
Text
id pubmed-4129397
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4129397 2014-08-26 Front Genet Genetics Frontiers Media S.A. 2014-08-12 /pmc/articles/PMC4129397/ /pubmed/25161663 http://dx.doi.org/10.3389/fgene.2014.00267 Text en Copyright © 2014 Wu, Chen, Zhi, Liu and Zhang. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A hidden Markov model for haplotype inference for present-absent data of clustered genes using identified haplotypes and haplotype patterns
topic Genetics