Shape-IT: new rapid and accurate algorithm for haplotype inference
BACKGROUND: We have developed a new computational algorithm, Shape-IT, to infer haplotypes under the genetic model of coalescence with recombination developed by Stephens et al in Phase v2.1. It runs much faster than Phase v2.1 while exhibiting the same accuracy. The major algorithmic improvements r...
Main Authors: | Delaneau, Olivier; Coulonges, Cédric; Zagury, Jean-François |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2008 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2647951/ https://www.ncbi.nlm.nih.gov/pubmed/19087329 http://dx.doi.org/10.1186/1471-2105-9-540 |
_version_ | 1782164955895693312 |
---|---|
author | Delaneau, Olivier; Coulonges, Cédric; Zagury, Jean-François |
author_sort | Delaneau, Olivier |
collection | PubMed |
description | BACKGROUND: We have developed a new computational algorithm, Shape-IT, to infer haplotypes under the genetic model of coalescence with recombination developed by Stephens et al. in Phase v2.1. It runs much faster than Phase v2.1 while exhibiting the same accuracy. The major algorithmic improvements rely on the use of binary trees to represent the sets of candidate haplotypes for each individual. These binary tree representations: (1) speed up the computation of the posterior probabilities of the haplotypes by avoiding the redundant operations made in Phase v2.1, and (2) overcome the exponential aspect of the haplotype inference problem by the smart exploration of the most plausible pathways (i.e., haplotypes) in the binary trees. RESULTS: Our results show that Shape-IT is several orders of magnitude faster than Phase v2.1 while being as accurate. For instance, Shape-IT runs 50 times faster than Phase v2.1 to compute the haplotypes of 200 subjects on 6,000 segments of 50 SNPs extracted from a standard Illumina 300 K chip (13 days instead of 630 days). We also compared Shape-IT with other widely used software (Gerbil, PL-EM, Fastphase, 2SNP, and Ishape) in various tests: Shape-IT and Phase v2.1 were the most accurate in all cases, followed by Ishape and Fastphase. In terms of speed, Shape-IT was faster than Ishape and Fastphase for datasets smaller than 100 SNPs, but Fastphase became faster, though still less accurate, at inferring haplotypes on larger SNP datasets. CONCLUSION: Shape-IT deserves to be used extensively for regular haplotype inference, and also in the context of the new high-throughput genotyping chips, since it makes it possible to fit the genetic model of Phase v2.1 on large datasets. This new algorithm based on tree representations could be used in other HMM-based haplotype inference software and may apply more broadly to other fields using HMMs. |
format | Text |
id | pubmed-2647951 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2647951 2009-02-26 Shape-IT: new rapid and accurate algorithm for haplotype inference Delaneau, Olivier; Coulonges, Cédric; Zagury, Jean-François BMC Bioinformatics Research Article BioMed Central 2008-12-16 /pmc/articles/PMC2647951/ /pubmed/19087329 http://dx.doi.org/10.1186/1471-2105-9-540 Text en Copyright © 2008 Delaneau et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Delaneau, Olivier Coulonges, Cédric Zagury, Jean-François Shape-IT: new rapid and accurate algorithm for haplotype inference |
title | Shape-IT: new rapid and accurate algorithm for haplotype inference |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2647951/ https://www.ncbi.nlm.nih.gov/pubmed/19087329 http://dx.doi.org/10.1186/1471-2105-9-540 |
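
The abstract attributes the speed-up to representing each individual's candidate haplotypes as a binary tree, so that work on shared prefixes is not repeated. The following is only a minimal sketch of that general idea, not the authors' implementation: the genotype coding (0 and 2 for the two homozygous states, 1 for heterozygous) and the names `Node`, `build_tree`, `leaves`, and `count_nodes` are assumptions made for illustration. It builds the tree of haplotypes compatible with one genotype and compares the number of tree nodes with the number of per-site steps needed when every haplotype is processed separately.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    """Node of a binary prefix tree; a root-to-leaf path spells one haplotype."""
    children: Dict[int, "Node"] = field(default_factory=dict)


def build_tree(genotype: List[int]) -> Node:
    """Build the tree of haplotypes compatible with a genotype.

    Assumed coding (hypothetical, for this sketch only):
    0 = homozygous for allele 0, 2 = homozygous for allele 1,
    1 = heterozygous, so either allele may sit on the haplotype.
    """
    root = Node()
    frontier = [root]
    for g in genotype:
        alleles = (0, 1) if g == 1 else (g // 2,)
        next_frontier = []
        for node in frontier:
            for allele in alleles:
                child = Node()
                node.children[allele] = child
                next_frontier.append(child)
        frontier = next_frontier
    return root


def leaves(node: Node, prefix: Tuple[int, ...] = ()) -> Iterator[Tuple[int, ...]]:
    """Yield every complete haplotype stored in the tree."""
    if not node.children:
        yield prefix
        return
    for allele, child in sorted(node.children.items()):
        yield from leaves(child, prefix + (allele,))


def count_nodes(node: Node) -> int:
    """Count the nodes below `node`, i.e. one per distinct haplotype prefix."""
    return sum(1 + count_nodes(child) for child in node.children.values())


genotype = [1, 0, 1, 2, 1]            # three heterozygous sites
tree = build_tree(genotype)
haplotypes = list(leaves(tree))
print(len(haplotypes))                 # 8 candidate haplotypes (2**3)
print(len(haplotypes) * len(genotype)) # 40 per-site steps if each is handled alone
print(count_nodes(tree))               # 20 nodes when prefixes are shared
```

In this toy case a quantity computed once per tree node is evaluated 20 times instead of 40. The second point of the abstract, exploring only the most plausible pathways, would correspond to pruning low-probability branches of such a tree; that step is not shown here.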