
ISHAPE: new rapid and accurate software for haplotyping


Bibliographic Details
Main authors: Delaneau, Olivier; Coulonges, Cédric; Boelle, Pierre-Yves; Nelson, George; Spadoni, Jean-Louis; Zagury, Jean-François
Format: Text
Language: English
Published: BioMed Central 2007
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1919397/
https://www.ncbi.nlm.nih.gov/pubmed/17573965
http://dx.doi.org/10.1186/1471-2105-8-205
Description
Summary:
BACKGROUND: We have developed a new haplotyping program based on the combination of an iterative multiallelic EM algorithm (IEM), bootstrap resampling and a pseudo Gibbs sampler. The IEM-bootstrap procedure considerably reduces the space of possible haplotype configurations to be explored, which greatly reduces computation time, while the adaptation of the Gibbs sampler with a recombination model on this restricted space maintains high accuracy. On large SNP datasets (>30 SNPs), we used a segmented approach based on a specific partition-ligation strategy. We compared this software, Ishape (Iterative Segmented HAPlotyping by Em), with reference programs such as Phase, Fastphase, and PL-EM. As with Phase, there are two versions of Ishape: Ishape1, which uses a simple coalescence model for the pseudo Gibbs sampler step, and Ishape2, which uses a recombination model instead.

RESULTS: We tested the program on two types of real SNP datasets derived from Hapmap: adjacent SNPs (high LD) and SNPs spaced 5 kb apart (lower LD). In both cases, we tested 100 replicates for each size: 10, 20, 30, 40, 50, 60, and 80 SNPs. For adjacent SNPs, Ishape2 is superior to the other software in both speed and accuracy. For SNPs spaced 5 kb apart, Ishape2 yields accuracy similar to that of Phase2.1, and both outperform the other software. In terms of speed, Ishape2 runs about 4 times faster than Phase2.1 with 10 SNPs, and about 10 times faster with 80 SNPs. For 5 kb-spaced SNPs, Fastphase may run faster with more than 100 SNPs.

CONCLUSION: These results show that the Ishape heuristic approach to haplotyping is very competitive in terms of accuracy and speed, and that it deserves extensive evaluation for possible future widespread use.
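
For readers unfamiliar with EM-based haplotyping, the sketch below illustrates the general idea that the abstract builds on: estimating haplotype frequencies from unphased genotypes with a plain expectation-maximization loop. It is not the Ishape implementation. It assumes biallelic SNPs coded 0/1/2 and Hardy-Weinberg proportions, and it leaves out the iterative multiallelic scheme, the bootstrap resampling, the pseudo Gibbs sampler and partition-ligation. The function names `compatible_pairs` and `em_haplotype_frequencies` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    genotype: tuple of 0/1/2, the number of alternate alleles at each SNP.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites are fixed (0 -> 0, 2 -> 1)
    if not het:
        h = tuple(base)
        return [(h, h)]
    pairs = []
    # Pin the phase of the first heterozygous site so each pair is listed once.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, (0,) + bits):
            h1[site], h2[site] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies with a basic EM loop (illustrative only)."""
    configs = [compatible_pairs(g) for g in genotypes]
    haps = {h for pairs in configs for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in configs:
            # E-step: posterior probability of each pair under Hardy-Weinberg.
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: counts[h] / n_chromosomes for h in haps}
    return freq


# Toy data: four homozygous individuals and one double heterozygote.
genotypes = [(0, 0), (2, 2), (1, 1), (0, 0), (2, 2)]
freq = em_haplotype_frequencies(genotypes)
for hap, f in sorted(freq.items()):
    print(hap, round(f, 4))
```

In this toy dataset the double heterozygote is ambiguous on its own, and the EM loop resolves it toward the haplotypes already observed in the homozygous individuals. The number of compatible pairs doubles with each additional heterozygous site, which is the growth in configuration space that the abstract says the IEM-bootstrap step is designed to contain.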