Correcting Bias in Allele Frequency Estimates Due to an Observation Threshold: A Markov Chain Analysis

There are many problems in biology and related disciplines involving stochasticity, where a signal can only be detected when it lies above a threshold level, while signals lying below threshold are simply not detected. A consequence is that the detected signal is conditioned to lie above threshold,...

Bibliographic Details
Main Authors: Gossmann, Toni I., Waxman, David
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9016752/
https://www.ncbi.nlm.nih.gov/pubmed/35349695
http://dx.doi.org/10.1093/gbe/evac047
_version_ 1784688594619727872
author Gossmann, Toni I.
Waxman, David
collection PubMed
description There are many problems in biology and related disciplines involving stochasticity, where a signal can only be detected when it lies above a threshold level, while signals lying below threshold are simply not detected. A consequence is that the detected signal is conditioned to lie above threshold, and is not representative of the actual signal. In this work, we present some general results for the conditioning that occurs due to the existence of such an observational threshold. We show that this conditioning is relevant, for example, to gene-frequency trajectories, where many loci in the genome are simultaneously measured in a given generation. Such a threshold can lead to severe biases of allele frequency estimates under purifying selection. In the analysis presented, within the context of Markov chains such as the Wright–Fisher model, we address two key questions: (1) “What is a natural measure of the strength of the conditioning associated with an observation threshold?” (2) “What is a principled way to correct for the effects of the conditioning?”. We answer the first question in terms of a proportion. Starting with a large number of trajectories, the relevant quantity is the proportion of these trajectories that are above threshold at a later time and hence are detected. The smaller the value of this proportion, the stronger the effects of conditioning. We provide an approximate analytical answer to the second question, that corrects the bias produced by an observation threshold, and performs to reasonable accuracy in the Wright–Fisher model for biologically plausible parameter values.
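The proportion described in the abstract can be illustrated with a small exact Markov chain calculation. The sketch below is not the authors' method or their analytical correction; it only shows, under stated assumptions, how an observation threshold inflates the mean allele frequency among detected trajectories. The haploid-style selection step, the parameter values, the "at or above" threshold convention, and all function names are assumptions introduced for illustration.

```python
from math import comb


def wf_transition_matrix(n_copies, s):
    """Wright-Fisher transition matrix for n_copies gene copies.

    Assumed (hypothetical) selection scheme: an allele at frequency x is
    reweighted to x(1+s)/(1+s*x) before binomial sampling; s < 0 mimics
    purifying selection.
    """
    matrix = []
    for i in range(n_copies + 1):
        x = i / n_copies
        p = x * (1 + s) / (1 + s * x)  # frequency after selection
        row = [
            comb(n_copies, j) * p**j * (1 - p) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def propagate(dist, matrix, generations):
    """Advance a probability distribution over allele counts."""
    size = len(dist)
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(size))
            for j in range(size)
        ]
    return dist


def threshold_summary(n_copies=50, s=-0.05, start_count=10,
                      generations=20, threshold_count=5):
    """Return (proportion detected, unconditioned mean, conditioned mean).

    A trajectory counts as detected if its allele count is at or above
    threshold_count at the final generation (an assumed convention).
    """
    matrix = wf_transition_matrix(n_copies, s)
    dist = [0.0] * (n_copies + 1)
    dist[start_count] = 1.0  # all trajectories start at the same count
    dist = propagate(dist, matrix, generations)

    mean_all = sum(j / n_copies * p for j, p in enumerate(dist))
    detected = sum(p for j, p in enumerate(dist) if j >= threshold_count)
    if detected == 0.0:
        raise ValueError("no probability mass at or above the threshold")
    mean_detected = sum(
        j / n_copies * p for j, p in enumerate(dist) if j >= threshold_count
    ) / detected
    return detected, mean_all, mean_detected


if __name__ == "__main__":
    detected, mean_all, mean_detected = threshold_summary()
    print(f"proportion detected:         {detected:.4f}")
    print(f"mean frequency (all):        {mean_all:.4f}")
    print(f"mean frequency (detected):   {mean_detected:.4f}")
```

In this toy setting the mean over detected trajectories always exceeds the mean over all trajectories, and the gap widens as the detected proportion shrinks, which matches the qualitative statement in the abstract.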
format Online
Article
Text
id pubmed-9016752
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9016752 2022-04-20 Correcting Bias in Allele Frequency Estimates Due to an Observation Threshold: A Markov Chain Analysis Gossmann, Toni I. Waxman, David Genome Biol Evol Research Article Oxford University Press 2022-03-29 /pmc/articles/PMC9016752/ /pubmed/35349695 http://dx.doi.org/10.1093/gbe/evac047 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Correcting Bias in Allele Frequency Estimates Due to an Observation Threshold: A Markov Chain Analysis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9016752/
https://www.ncbi.nlm.nih.gov/pubmed/35349695
http://dx.doi.org/10.1093/gbe/evac047