
Modelling haplotypes with respect to reference cohort variation graphs

MOTIVATION: Current statistical models of haplotypes are limited to panels of haplotypes whose genetic variation can be represented by arrays of values at linearly ordered bi- or multiallelic loci. These methods cannot model structural variants or variants that nest or overlap. RESULTS: A variation...


Bibliographic Details
Main Authors: Rosen, Yohei; Eizenga, Jordan; Paten, Benedict
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870562/
https://www.ncbi.nlm.nih.gov/pubmed/28881971
http://dx.doi.org/10.1093/bioinformatics/btx236
author Rosen, Yohei
Eizenga, Jordan
Paten, Benedict
collection PubMed
description MOTIVATION: Current statistical models of haplotypes are limited to panels of haplotypes whose genetic variation can be represented by arrays of values at linearly ordered bi- or multiallelic loci. These methods cannot model structural variants or variants that nest or overlap. RESULTS: A variation graph is a mathematical structure that can encode arbitrarily complex genetic variation. We present the first haplotype model that operates on a variation graph-embedded population reference cohort. We describe an algorithm to calculate the likelihood that a haplotype arose from this cohort through recombinations and demonstrate time complexity linear in haplotype length and sublinear in population size. We furthermore demonstrate a method of rapidly calculating likelihoods for related haplotypes. We describe mathematical extensions to allow modelling of mutations. This work is an important incremental step for clinical genomics and genetic epidemiology since it is the first haplotype model which can represent all sorts of variation in the population. AVAILABILITY AND IMPLEMENTATION: Available on GitHub at https://github.com/yoheirosen/vg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5870562
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870562 2018-04-05 Modelling haplotypes with respect to reference cohort variation graphs Rosen, Yohei Eizenga, Jordan Paten, Benedict Bioinformatics ISMB/ECCB 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017 Oxford University Press 2017-07-15 2017-07-12 /pmc/articles/PMC5870562/ /pubmed/28881971 http://dx.doi.org/10.1093/bioinformatics/btx236 Text en © The Author 2017. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Modelling haplotypes with respect to reference cohort variation graphs
topic ISMB/ECCB 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017