Haplotype matching in large cohorts using the Li and Stephens model

MOTIVATION: The Li and Stephens model, which approximates the coalescent describing the pattern of variation in a population, underpins a range of key tools and results in genetics. Although highly efficient compared to the coalescent, standard implementations of this model still cannot deal with th...


Bibliographic Details
Main author: Lunter, Gerton
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6394399/
https://www.ncbi.nlm.nih.gov/pubmed/30165547
http://dx.doi.org/10.1093/bioinformatics/bty735
collection PubMed
description MOTIVATION: The Li and Stephens model, which approximates the coalescent describing the pattern of variation in a population, underpins a range of key tools and results in genetics. Although highly efficient compared to the coalescent, standard implementations of this model still cannot deal with the very large reference cohorts that are starting to become available, and practical implementations use heuristics to achieve reasonable runtimes. RESULTS: Here I describe a new, exact algorithm (‘fastLS’) that implements the Li and Stephens model and achieves runtimes independent of the size of the reference cohort. Key to achieving this runtime is the use of the Burrows-Wheeler transform, allowing the algorithm to efficiently identify partial haplotype matches across a cohort. I show that the proposed data structure is very similar to, and generalizes, Durbin’s positional Burrows-Wheeler transform.
id pubmed-6394399
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6394399 2019-03-05 Haplotype matching in large cohorts using the Li and Stephens model Lunter, Gerton Bioinformatics Original Papers Oxford University Press 2019-03-01 2018-08-25 /pmc/articles/PMC6394399/ /pubmed/30165547 http://dx.doi.org/10.1093/bioinformatics/bty735 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Original Papers