Cargando…

An average-case sublinear forward algorithm for the haploid Li and Stephens model

BACKGROUND: Hidden Markov models of haplotype inheritance such as the Li and Stephens model allow for computationally tractable probability calculations using the forward algorithm as long as the representative reference panel used in the model is sufficiently small. Specifically, the monoploid Li a...

Descripción completa

Detalles Bibliográficos
Autores principales: Rosen, Yohei M., Paten, Benedict J.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2019
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446408/
https://www.ncbi.nlm.nih.gov/pubmed/30988694
http://dx.doi.org/10.1186/s13015-019-0144-9
_version_ 1783408360373092352
author Rosen, Yohei M.
Paten, Benedict J.
author_facet Rosen, Yohei M.
Paten, Benedict J.
author_sort Rosen, Yohei M.
collection PubMed
description BACKGROUND: Hidden Markov models of haplotype inheritance such as the Li and Stephens model allow for computationally tractable probability calculations using the forward algorithm as long as the representative reference panel used in the model is sufficiently small. Specifically, the monoploid Li and Stephens model and its variants are linear in reference panel size unless heuristic approximations are used. However, sequencing projects numbering in the thousands to hundreds of thousands of individuals are underway, and others numbering in the millions are anticipated. RESULTS: To make the forward algorithm for the haploid Li and Stephens model computationally tractable for these datasets, we have created a numerically exact version of the algorithm with observed average case sublinear runtime with respect to reference panel size k when tested against the 1000 Genomes dataset. CONCLUSIONS: We show a forward algorithm which avoids any tradeoff between runtime and model complexity. Our algorithm makes use of two general strategies which might be applicable to improving the time complexity of other future sequence analysis algorithms: sparse dynamic programming matrices and lazy evaluation.
format Online
Article
Text
id pubmed-6446408
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-64464082019-04-15 An average-case sublinear forward algorithm for the haploid Li and Stephens model Rosen, Yohei M. Paten, Benedict J. Algorithms Mol Biol Research BACKGROUND: Hidden Markov models of haplotype inheritance such as the Li and Stephens model allow for computationally tractable probability calculations using the forward algorithm as long as the representative reference panel used in the model is sufficiently small. Specifically, the monoploid Li and Stephens model and its variants are linear in reference panel size unless heuristic approximations are used. However, sequencing projects numbering in the thousands to hundreds of thousands of individuals are underway, and others numbering in the millions are anticipated. RESULTS: To make the forward algorithm for the haploid Li and Stephens model computationally tractable for these datasets, we have created a numerically exact version of the algorithm with observed average case sublinear runtime with respect to reference panel size k when tested against the 1000 Genomes dataset. CONCLUSIONS: We show a forward algorithm which avoids any tradeoff between runtime and model complexity. Our algorithm makes use of two general strategies which might be applicable to improving the time complexity of other future sequence analysis algorithms: sparse dynamic programming matrices and lazy evaluation. BioMed Central 2019-04-02 /pmc/articles/PMC6446408/ /pubmed/30988694 http://dx.doi.org/10.1186/s13015-019-0144-9 Text en © The Author(s) 2019 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Rosen, Yohei M.
Paten, Benedict J.
An average-case sublinear forward algorithm for the haploid Li and Stephens model
title An average-case sublinear forward algorithm for the haploid Li and Stephens model
title_full An average-case sublinear forward algorithm for the haploid Li and Stephens model
title_fullStr An average-case sublinear forward algorithm for the haploid Li and Stephens model
title_full_unstemmed An average-case sublinear forward algorithm for the haploid Li and Stephens model
title_short An average-case sublinear forward algorithm for the haploid Li and Stephens model
title_sort average-case sublinear forward algorithm for the haploid li and stephens model
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446408/
https://www.ncbi.nlm.nih.gov/pubmed/30988694
http://dx.doi.org/10.1186/s13015-019-0144-9
work_keys_str_mv AT rosenyoheim anaveragecasesublinearforwardalgorithmforthehaploidliandstephensmodel
AT patenbenedictj anaveragecasesublinearforwardalgorithmforthehaploidliandstephensmodel
AT rosenyoheim averagecasesublinearforwardalgorithmforthehaploidliandstephensmodel
AT patenbenedictj averagecasesublinearforwardalgorithmforthehaploidliandstephensmodel