
Maximal Perfect Haplotype Blocks with Wildcards

Recent work provides the first method to measure the relative fitness of genomic variants within a population that scales to large numbers of genomes. A key component of the computation involves finding maximal perfect haplotype blocks from a set of genomic samples for which SNPs (single-nucleotide...


Bibliographic Details
Main authors: Williams, Lucia, Mumey, Brendan
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243190/
https://www.ncbi.nlm.nih.gov/pubmed/32446220
http://dx.doi.org/10.1016/j.isci.2020.101149
_version_ 1783537380872945664
author Williams, Lucia
Mumey, Brendan
author_facet Williams, Lucia
Mumey, Brendan
author_sort Williams, Lucia
collection PubMed
description Recent work provides the first method to measure the relative fitness of genomic variants within a population that scales to large numbers of genomes. A key component of the computation involves finding maximal perfect haplotype blocks from a set of genomic samples for which SNPs (single-nucleotide polymorphisms) have been called. Often, owing to low read coverage and imperfect assemblies, some of the SNP calls can be missing from some of the samples. In this work, we consider the problem of finding maximal perfect haplotype blocks where some missing values may be present. Missing values are treated as wildcards, and the definition of maximal perfect haplotype blocks is extended in a natural way. We provide an output-linear time algorithm to identify all such blocks and demonstrate the algorithm on a large population SNP dataset. Our software is publicly available.
format Online
Article
Text
id pubmed-7243190
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-72431902020-05-26 Maximal Perfect Haplotype Blocks with Wildcards Williams, Lucia Mumey, Brendan iScience Article Recent work provides the first method to measure the relative fitness of genomic variants within a population that scales to large numbers of genomes. A key component of the computation involves finding maximal perfect haplotype blocks from a set of genomic samples for which SNPs (single-nucleotide polymorphisms) have been called. Often, owing to low read coverage and imperfect assemblies, some of the SNP calls can be missing from some of the samples. In this work, we consider the problem of finding maximal perfect haplotype blocks where some missing values may be present. Missing values are treated as wildcards, and the definition of maximal perfect haplotype blocks is extended in a natural way. We provide an output-linear time algorithm to identify all such blocks and demonstrate the algorithm on a large population SNP dataset. Our software is publicly available. Elsevier 2020-05-11 /pmc/articles/PMC7243190/ /pubmed/32446220 http://dx.doi.org/10.1016/j.isci.2020.101149 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Williams, Lucia
Mumey, Brendan
Maximal Perfect Haplotype Blocks with Wildcards
title Maximal Perfect Haplotype Blocks with Wildcards
title_full Maximal Perfect Haplotype Blocks with Wildcards
title_fullStr Maximal Perfect Haplotype Blocks with Wildcards
title_full_unstemmed Maximal Perfect Haplotype Blocks with Wildcards
title_short Maximal Perfect Haplotype Blocks with Wildcards
title_sort maximal perfect haplotype blocks with wildcards
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243190/
https://www.ncbi.nlm.nih.gov/pubmed/32446220
http://dx.doi.org/10.1016/j.isci.2020.101149
work_keys_str_mv AT williamslucia maximalperfecthaplotypeblockswithwildcards
AT mumeybrendan maximalperfecthaplotypeblockswithwildcards
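
The description above says that missing SNP calls are treated as wildcards and that the definition of a perfect haplotype block is extended accordingly, but this record does not spell out that definition. As a minimal sketch of one plausible reading (an assumption, not the authors' formal definition or their output-linear algorithm), a set of samples can be said to form a perfect block on a column range if, in every column of the range, all non-missing calls agree. The `WILDCARD` symbol, the function name `is_perfect_block`, and the toy data are hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Hypothetical marker for a missing SNP call; the record does not name one.
WILDCARD = "*"


def is_perfect_block(
    haplotypes: Sequence[str],
    rows: Sequence[int],
    start: int,
    end: int,
) -> bool:
    """Check an assumed wildcard-aware block condition.

    Returns True if, in every column from start to end (inclusive),
    the selected rows carry at most one distinct non-wildcard allele.
    """
    for col in range(start, end + 1):
        # Collect the alleles seen in this column, ignoring missing calls.
        alleles = {haplotypes[r][col] for r in rows} - {WILDCARD}
        if len(alleles) > 1:
            return False
    return True


# Toy data: four samples, four SNP positions, two missing calls.
haplotypes = ["0110", "0*10", "1100", "01*1"]

# Rows 0, 1 and 3 agree on columns 0..2 once wildcards are ignored.
agree_on_prefix = is_perfect_block(haplotypes, [0, 1, 3], 0, 2)

# Extending the range to column 3 breaks the block (row 3 has '1', the others '0').
agree_on_all = is_perfect_block(haplotypes, [0, 1, 3], 0, 3)
```

The check is done column by column rather than pairwise between samples because wildcard matching is not transitive: two samples can each be compatible with a third that has a missing call while disagreeing with each other. Enumerating all maximal blocks efficiently is the subject of the article itself and is not attempted here.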