
HaploPOP: a software that improves population assignment by combining markers into haplotypes


Bibliographic Details
Main authors: Duforet-Frebourg, Nicolas; Gattepaille, Lucie M.; Blum, Michael G.B; Jakobsson, Mattias
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4521458/
https://www.ncbi.nlm.nih.gov/pubmed/26227424
http://dx.doi.org/10.1186/s12859-015-0661-6
author Duforet-Frebourg, Nicolas
Gattepaille, Lucie M.
Blum, Michael G.B
Jakobsson, Mattias
collection PubMed
description BACKGROUND: In ecology and forensics, some population assignment techniques use molecular markers to assign individuals to known groups. However, assigning individuals to known populations can be difficult if the level of genetic differentiation among populations is small. Most assignment studies handle independent markers, often by pruning markers in linkage disequilibrium (LD), ignoring the information contained in the correlation among markers due to LD. RESULTS: To improve the accuracy of population assignment, we present an algorithm, implemented in the HaploPOP software, that combines markers into haplotypes, without requiring independence. The algorithm is based on the Gain of Informativeness for Assignment, which provides a measure to decide whether a pair of markers should be combined into haplotypes in order to improve assignment. Because complete exploration of all possible solutions for constructing haplotypes is computationally prohibitive, our approach uses a greedy algorithm based on windows of fixed sizes. We evaluate the performance of HaploPOP to assign individuals to populations using a split-validation approach. We investigate both simulated SNP data and dense genotype data from individuals from Spain and Portugal. CONCLUSIONS: Our results show that constructing haplotypes with HaploPOP can substantially reduce assignment error. The HaploPOP software is freely available as command-line software at www.ieg.uu.se/Jakobsson/software/HaploPOP/.
format Online
Article
Text
id pubmed-4521458
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4521458 2015-08-01 BMC Bioinformatics Research Article BioMed Central 2015-07-31 /pmc/articles/PMC4521458/ /pubmed/26227424 http://dx.doi.org/10.1186/s12859-015-0661-6 Text en © Duforet-Frebourg et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title HaploPOP: a software that improves population assignment by combining markers into haplotypes
topic Research Article