A fast and accurate enumeration-based algorithm for haplotyping a triploid individual

BACKGROUND: Haplotype assembly, reconstructing haplotypes from sequence data, is one of the major computational problems in bioinformatics. Most of the current methodologies for haplotype assembly are designed for diploid individuals. In recent years, genomes having more than two sets of homologous...

Bibliographic Details
Main authors: Wu, Jingli; Zhang, Qian
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5984336/
https://www.ncbi.nlm.nih.gov/pubmed/29881444
http://dx.doi.org/10.1186/s13015-018-0129-0
_version_ 1783328593282072576
author Wu, Jingli
Zhang, Qian
author_facet Wu, Jingli
Zhang, Qian
author_sort Wu, Jingli
collection PubMed
description BACKGROUND: Haplotype assembly, the reconstruction of haplotypes from sequence data, is one of the major computational problems in bioinformatics. Most current methods for haplotype assembly are designed for diploid individuals. In recent years, genomes with more than two sets of homologous chromosomes have attracted many research groups interested in the genomics of disease, phylogenetics, botany and evolution. However, methods for reconstructing polyploid haplotypes are still lacking. RESULTS: In this work, the minimum error correction with genotype information (MEC/GI) model, an important combinatorial model for haplotyping a single individual, is used to study the triploid individual haplotype reconstruction problem. A fast and accurate enumeration-based algorithm, enumeration haplotyping triploid with least difference (EHTLD), is proposed for solving the MEC/GI model. The EHTLD algorithm reconstructs the three haplotypes in the order of the single nucleotide polymorphism (SNP) loci along them. When reconstructing a given SNP site, the EHTLD algorithm enumerates three kinds of SNP values according to the genotype value of that site, and fills the site with the one that leads to the minimum difference between the reconstructed haplotypes and the sequenced fragments covering that site. CONCLUSION: Extensive experimental comparisons were performed between the EHTLD algorithm and the well-known HapCompass and HapTree algorithms. In these experiments, the EHTLD algorithm reconstructed more accurate haplotypes than HapCompass and HapTree.
format Online
Article
Text
id pubmed-5984336
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5984336 2018-06-07 A fast and accurate enumeration-based algorithm for haplotyping a triploid individual Wu, Jingli Zhang, Qian Algorithms Mol Biol Research
BioMed Central 2018-06-01 /pmc/articles/PMC5984336/ /pubmed/29881444 http://dx.doi.org/10.1186/s13015-018-0129-0 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A fast and accurate enumeration-based algorithm for haplotyping a triploid individual
title_sort fast and accurate enumeration-based algorithm for haplotyping a triploid individual
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5984336/
https://www.ncbi.nlm.nih.gov/pubmed/29881444
http://dx.doi.org/10.1186/s13015-018-0129-0
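
The site-by-site enumeration described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' EHTLD implementation. It assumes biallelic SNPs coded 0/1, a genotype given as the number of 1-alleles at each site (0 to 3), fragments stored as equal-length lists with None marking uncovered sites, and a "difference" measured as the sum over covering fragments of the Hamming distance to the closest haplotype prefix. The names candidate_columns, hamming and greedy_triploid are hypothetical.

```python
from itertools import permutations

GAP = None  # assumed marker: the fragment does not cover this site


def candidate_columns(genotype):
    """Distinct ways to place `genotype` copies of allele 1 on three haplotypes.

    Genotypes 1 and 2 each give three candidates; 0 and 3 give one.
    """
    base = (1,) * genotype + (0,) * (3 - genotype)
    return sorted(set(permutations(base)))


def hamming(fragment, haplotype, upto):
    """Mismatches between a fragment and a haplotype over sites 0..upto, skipping gaps."""
    return sum(
        1
        for j in range(upto + 1)
        if fragment[j] is not GAP and fragment[j] != haplotype[j]
    )


def greedy_triploid(fragments, genotypes):
    """Fill the three haplotypes one SNP site at a time, left to right.

    At each site, every candidate column allowed by the genotype is tried and
    the one with the smallest total difference to the covering fragments is
    kept (first candidate wins ties). The scoring rule is an assumption.
    """
    haps = [[], [], []]
    for site, genotype in enumerate(genotypes):
        covering = [f for f in fragments if f[site] is not GAP]
        best_col, best_cost = None, None
        for col in candidate_columns(genotype):
            trial = [h + [a] for h, a in zip(haps, col)]
            cost = sum(min(hamming(f, t, site) for t in trial) for f in covering)
            if best_cost is None or cost < best_cost:
                best_col, best_cost = col, cost
        for h, a in zip(haps, best_col):
            h.append(a)
    return haps
```

Because the three haplotypes are unlabeled, results from this sketch are best compared as a sorted set of haplotypes rather than by position. Being greedy, it can commit to a poor choice at an early site when coverage is sparse or reads are noisy.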