Cargando…

A fast and accurate algorithm for single individual haplotyping

BACKGROUND: Due to the difficulty in separating two (paternal and maternal) copies of a chromosome, most published human genome sequences only provide genotype information, i.e., the mixed information of the underlying two haplotypes. However, phased haplotype information is needed to completely und...

Descripción completa

Detalles Bibliográficos
Autores principales: Xie, Minzhu, Wang, Jianxin, Jiang, Tao
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2012
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521186/
https://www.ncbi.nlm.nih.gov/pubmed/23282221
http://dx.doi.org/10.1186/1752-0509-6-S2-S8
_version_ 1782252900751245312
author Xie, Minzhu
Wang, Jianxin
Jiang, Tao
author_facet Xie, Minzhu
Wang, Jianxin
Jiang, Tao
author_sort Xie, Minzhu
collection PubMed
description BACKGROUND: Due to the difficulty in separating two (paternal and maternal) copies of a chromosome, most published human genome sequences only provide genotype information, i.e., the mixed information of the underlying two haplotypes. However, phased haplotype information is needed to completely understand complex genetic polymorphisms and to increase the power of genome-wide association studies for complex diseases. With the rapid development of DNA sequencing technologies, reconstructing a pair of haplotypes from an individual's aligned DNA fragments by computer algorithms (i.e., Single Individual Haplotyping) has become a practical haplotyping approach. RESULTS: In the paper, we combine two measures "errors corrected" and "fragments cut" and propose a new optimization model, called Balanced Optimal Partition (BOP), for single individual haplotyping. The model generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and could be made either model by using some extreme parameter values. To solve the model, we design a heuristic dynamic programming algorithm H-BOP. By limiting the number of intermediate solutions at each iteration to an appropriately chosen small integer k, H-BOP is able to solve the model efficiently. CONCLUSIONS: Extensive experimental results on simulated and real data show that when k = 8, H-BOP is generally faster and more accurate than a recent state-of-art algorithm ReFHap in haplotype reconstruction. The running time of H-BOP is linearly dependent on some of the key parameters controlling the input size and H-BOP scales well to large input data. The code of H-BOP is available to the public for free upon request to the corresponding author.
format Online
Article
Text
id pubmed-3521186
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-35211862012-12-14 A fast and accurate algorithm for single individual haplotyping Xie, Minzhu Wang, Jianxin Jiang, Tao BMC Syst Biol Proceedings BACKGROUND: Due to the difficulty in separating two (paternal and maternal) copies of a chromosome, most published human genome sequences only provide genotype information, i.e., the mixed information of the underlying two haplotypes. However, phased haplotype information is needed to completely understand complex genetic polymorphisms and to increase the power of genome-wide association studies for complex diseases. With the rapid development of DNA sequencing technologies, reconstructing a pair of haplotypes from an individual's aligned DNA fragments by computer algorithms (i.e., Single Individual Haplotyping) has become a practical haplotyping approach. RESULTS: In the paper, we combine two measures "errors corrected" and "fragments cut" and propose a new optimization model, called Balanced Optimal Partition (BOP), for single individual haplotyping. The model generalizes two existing models, Minimum Error Correction (MEC) and Maximum Fragments Cut (MFC), and could be made either model by using some extreme parameter values. To solve the model, we design a heuristic dynamic programming algorithm H-BOP. By limiting the number of intermediate solutions at each iteration to an appropriately chosen small integer k, H-BOP is able to solve the model efficiently. CONCLUSIONS: Extensive experimental results on simulated and real data show that when k = 8, H-BOP is generally faster and more accurate than a recent state-of-art algorithm ReFHap in haplotype reconstruction. The running time of H-BOP is linearly dependent on some of the key parameters controlling the input size and H-BOP scales well to large input data. The code of H-BOP is available to the public for free upon request to the corresponding author. BioMed Central 2012-12-12 /pmc/articles/PMC3521186/ /pubmed/23282221 http://dx.doi.org/10.1186/1752-0509-6-S2-S8 Text en Copyright ©2012 Xie et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Xie, Minzhu
Wang, Jianxin
Jiang, Tao
A fast and accurate algorithm for single individual haplotyping
title A fast and accurate algorithm for single individual haplotyping
title_full A fast and accurate algorithm for single individual haplotyping
title_fullStr A fast and accurate algorithm for single individual haplotyping
title_full_unstemmed A fast and accurate algorithm for single individual haplotyping
title_short A fast and accurate algorithm for single individual haplotyping
title_sort fast and accurate algorithm for single individual haplotyping
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3521186/
https://www.ncbi.nlm.nih.gov/pubmed/23282221
http://dx.doi.org/10.1186/1752-0509-6-S2-S8
work_keys_str_mv AT xieminzhu afastandaccuratealgorithmforsingleindividualhaplotyping
AT wangjianxin afastandaccuratealgorithmforsingleindividualhaplotyping
AT jiangtao afastandaccuratealgorithmforsingleindividualhaplotyping
AT xieminzhu fastandaccuratealgorithmforsingleindividualhaplotyping
AT wangjianxin fastandaccuratealgorithmforsingleindividualhaplotyping
AT jiangtao fastandaccuratealgorithmforsingleindividualhaplotyping