HaploRec: efficient and accurate large-scale reconstruction of haplotypes

BACKGROUND: Haplotypes extracted from human DNA can be used for gene mapping and other analysis of genetic patterns within and across populations. A fundamental problem is, however, that current practical laboratory methods do not give haplotype information. Estimation of phased haplotypes of unrela...

Bibliographic Details
Main Authors: Eronen, Lauri; Geerts, Floris; Toivonen, Hannu
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1766938/
https://www.ncbi.nlm.nih.gov/pubmed/17187677
http://dx.doi.org/10.1186/1471-2105-7-542
author Eronen, Lauri
Geerts, Floris
Toivonen, Hannu
collection PubMed
description BACKGROUND: Haplotypes extracted from human DNA can be used for gene mapping and other analysis of genetic patterns within and across populations. A fundamental problem, however, is that current practical laboratory methods do not give haplotype information. Estimation of phased haplotypes of unrelated individuals given their unphased genotypes is known as the haplotype reconstruction or phasing problem. RESULTS: We define three novel statistical models and give an efficient algorithm for haplotype reconstruction, jointly called HaploRec. HaploRec is based on exploiting local regularities conserved in haplotypes: it reconstructs haplotypes so that they have maximal local coherence. This approach, which does not assume statistical dependence for remotely located markers, has two useful properties: it is well suited for sparse marker maps, such as those used in gene mapping, and it can actually take advantage of long maps. CONCLUSION: Our experimental results with simulated and real data show that HaploRec is a powerful method for the large-scale haplotyping needed in association studies. With sample sizes large enough for gene mapping, it appeared to perform best among all tested methods (Phase, fastPhase, PL-EM, Snphap, Gerbil; simulated data); with small samples, it was competitive with the best available methods (real data). HaploRec is several orders of magnitude faster than Phase and comparable to the other methods; the running times are roughly linear in the number of subjects and the number of markers. HaploRec is publicly available at .
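To make the phasing problem in the description concrete, the following is a minimal sketch that enumerates every unordered haplotype pair consistent with one unphased genotype. It only illustrates why the problem is ambiguous and is not HaploRec's algorithm or any of its statistical models. The function name `compatible_haplotype_pairs` and the 0/1/2 genotype encoding are assumptions made for this example and do not come from the record.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    Assumed encoding (hypothetical, for illustration only), one code per marker:
      0 = homozygous for allele 0
      1 = heterozygous
      2 = homozygous for allele 1
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous markers are fixed on both haplotypes.
    base = [1 if g == 2 else 0 for g in genotype]
    if not het_sites:
        return [(tuple(base), tuple(base))]

    pairs = []
    # Pin the first heterozygous site to allele 0 on the first haplotype,
    # so that each unordered pair is generated exactly once.
    for bits in product((0, 1), repeat=len(het_sites) - 1):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, (0,) + bits):
            h1[site] = bit
            h2[site] = 1 - bit
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs([1, 0, 1, 2]):
        print(h1, h2)
```

A genotype with k heterozygous markers has 2^(k-1) compatible pairs, so the laboratory data alone cannot identify the true pair once k is 2 or more. A reconstruction method has to choose among these candidates using information from the rest of the sample.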
format Text
id pubmed-1766938
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1766938 2007-01-16 HaploRec: efficient and accurate large-scale reconstruction of haplotypes Eronen, Lauri; Geerts, Floris; Toivonen, Hannu BMC Bioinformatics Methodology Article BioMed Central 2006-12-22 /pmc/articles/PMC1766938/ /pubmed/17187677 http://dx.doi.org/10.1186/1471-2105-7-542 Text en Copyright © 2006 Eronen et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title HaploRec: efficient and accurate large-scale reconstruction of haplotypes
topic Methodology Article