
Evaluation of two methods for computational HLA haplotypes inference using a real dataset

BACKGROUND: HLA haplotype analysis has been used in population genetics and in the investigation of disease-susceptibility loci, due to its high polymorphism. Several methods for inferring haplotypes from genotypic data have been proposed, but it is unclear how accurate each of the methods is or which me...


Bibliographic Details
Main authors: Bettencourt, Bruno F, Santos, Margarida R, Fialho, Raquel N, Couto, Ana R, Peixoto, Maria J, Pinheiro, João P, Spínola, Hélder, Mora, Marian G, Santos, Cristina, Brehm, António, Bruges-Armas, Jácome
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2268655/
https://www.ncbi.nlm.nih.gov/pubmed/18230173
http://dx.doi.org/10.1186/1471-2105-9-68
_version_ 1782151678720475136
author Bettencourt, Bruno F
Santos, Margarida R
Fialho, Raquel N
Couto, Ana R
Peixoto, Maria J
Pinheiro, João P
Spínola, Hélder
Mora, Marian G
Santos, Cristina
Brehm, António
Bruges-Armas, Jácome
author_facet Bettencourt, Bruno F
Santos, Margarida R
Fialho, Raquel N
Couto, Ana R
Peixoto, Maria J
Pinheiro, João P
Spínola, Hélder
Mora, Marian G
Santos, Cristina
Brehm, António
Bruges-Armas, Jácome
author_sort Bettencourt, Bruno F
collection PubMed
description BACKGROUND: HLA haplotype analysis has been used in population genetics and in the investigation of disease-susceptibility loci, due to its high polymorphism. Several methods for inferring haplotypes from genotypic data have been proposed, but it is unclear how accurate each of the methods is or which method is superior. The accuracy of two of the leading methods of computational haplotype inference, one based on the Expectation-Maximization algorithm (implemented in Arlequin V3.0) and one based on a Bayesian algorithm (implemented in PHASE V2.1.1), was compared using a set of 122 HLA haplotypes (A-B-Cw-DQB1-DRB1) determined through direct counting. The accuracy was measured with the Mean Squared Error (MSE), Similarity Index (I(F)) and Haplotype Identification Index (I(H)). RESULTS: Neither method inferred all of the known haplotypes, and some differences were observed in the accuracy of the two methods in terms of both haplotype determination and haplotype frequency estimation. Working with haplotypes composed of low-polymorphism sites and present in more than one individual increased the confidence in the assignment of haplotypes and in the estimation of the haplotype frequencies generated by both programs. CONCLUSION: The method implemented in PHASE v2.1.1 had the best overall performance both in haplotype construction and frequency calculation, although the differences between the two methods were insubstantial. To our knowledge, this was the first work aiming to test statistical methods using real haplotypic data from the HLA region.
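The three accuracy measures named in the description can be illustrated with a minimal sketch. The abstract does not define them, so the formulas below are assumptions: MSE is averaged over the union of true and inferred haplotypes, I(F) is taken as one minus half the summed absolute frequency differences, and I(H) as twice the number of shared haplotypes divided by the total number of haplotypes in both sets. The function names, the `threshold` parameter, and the haplotype labels are hypothetical and are not taken from Arlequin or PHASE.

```python
def _all_haplotypes(true_freq, est_freq):
    # Union of haplotypes seen in either the counted or the inferred set.
    return sorted(set(true_freq) | set(est_freq))


def mean_squared_error(true_freq, est_freq):
    # Assumed definition: mean of squared frequency differences over the union.
    haps = _all_haplotypes(true_freq, est_freq)
    return sum((est_freq.get(h, 0.0) - true_freq.get(h, 0.0)) ** 2
               for h in haps) / len(haps)


def similarity_index(true_freq, est_freq):
    # Assumed definition of I(F): 1 - 0.5 * sum |estimated - true|.
    haps = _all_haplotypes(true_freq, est_freq)
    return 1.0 - 0.5 * sum(abs(est_freq.get(h, 0.0) - true_freq.get(h, 0.0))
                           for h in haps)


def identification_index(true_freq, est_freq, threshold=0.0):
    # Assumed definition of I(H): 2 * shared / (n_true + n_estimated),
    # counting only haplotypes whose frequency exceeds `threshold`.
    true_set = {h for h, f in true_freq.items() if f > threshold}
    est_set = {h for h, f in est_freq.items() if f > threshold}
    return 2 * len(true_set & est_set) / (len(true_set) + len(est_set))


# Hypothetical two-locus example; frequencies are illustrative only.
true_freq = {"A*01-B*08": 0.5, "A*02-B*07": 0.5}
est_freq = {"A*01-B*08": 0.5, "A*02-B*07": 0.25, "A*03-B*07": 0.25}

mse = mean_squared_error(true_freq, est_freq)
i_f = similarity_index(true_freq, est_freq)
i_h = identification_index(true_freq, est_freq)
print(mse, i_f, i_h)
```

Under these assumed definitions, a lower MSE and values of I(F) and I(H) closer to 1 indicate closer agreement between the inferred haplotypes and those determined by direct counting.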
format Text
id pubmed-2268655
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2268655 2008-03-18 Evaluation of two methods for computational HLA haplotypes inference using a real dataset Bettencourt, Bruno F Santos, Margarida R Fialho, Raquel N Couto, Ana R Peixoto, Maria J Pinheiro, João P Spínola, Hélder Mora, Marian G Santos, Cristina Brehm, António Bruges-Armas, Jácome BMC Bioinformatics Research Article BACKGROUND: HLA haplotype analysis has been used in population genetics and in the investigation of disease-susceptibility loci, due to its high polymorphism. Several methods for inferring haplotypes from genotypic data have been proposed, but it is unclear how accurate each of the methods is or which method is superior. The accuracy of two of the leading methods of computational haplotype inference, one based on the Expectation-Maximization algorithm (implemented in Arlequin V3.0) and one based on a Bayesian algorithm (implemented in PHASE V2.1.1), was compared using a set of 122 HLA haplotypes (A-B-Cw-DQB1-DRB1) determined through direct counting. The accuracy was measured with the Mean Squared Error (MSE), Similarity Index (I(F)) and Haplotype Identification Index (I(H)). RESULTS: Neither method inferred all of the known haplotypes, and some differences were observed in the accuracy of the two methods in terms of both haplotype determination and haplotype frequency estimation. Working with haplotypes composed of low-polymorphism sites and present in more than one individual increased the confidence in the assignment of haplotypes and in the estimation of the haplotype frequencies generated by both programs. CONCLUSION: The method implemented in PHASE v2.1.1 had the best overall performance both in haplotype construction and frequency calculation, although the differences between the two methods were insubstantial. To our knowledge, this was the first work aiming to test statistical methods using real haplotypic data from the HLA region.
BioMed Central 2008-01-29 /pmc/articles/PMC2268655/ /pubmed/18230173 http://dx.doi.org/10.1186/1471-2105-9-68 Text en Copyright © 2008 Bettencourt et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Bettencourt, Bruno F
Santos, Margarida R
Fialho, Raquel N
Couto, Ana R
Peixoto, Maria J
Pinheiro, João P
Spínola, Hélder
Mora, Marian G
Santos, Cristina
Brehm, António
Bruges-Armas, Jácome
Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title_full Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title_fullStr Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title_full_unstemmed Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title_short Evaluation of two methods for computational HLA haplotypes inference using a real dataset
title_sort evaluation of two methods for computational hla haplotypes inference using a real dataset
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2268655/
https://www.ncbi.nlm.nih.gov/pubmed/18230173
http://dx.doi.org/10.1186/1471-2105-9-68