Ascertainment bias in the genomic test of positive selection on regulatory sequences

Evolution of gene expression mediated by cis-regulatory changes is thought to be an important contributor to organismal adaptation, but identifying adaptive cis-regulatory changes is challenging due to the difficulty in knowing the expectation under no positive selection. A new approach for detectin...

Bibliographic Details
Main Authors: Jiang, Daohan; Zhang, Jianzhi
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10473660/
https://www.ncbi.nlm.nih.gov/pubmed/37662307
http://dx.doi.org/10.1101/2023.08.20.554030
author Jiang, Daohan
Zhang, Jianzhi
collection PubMed
description Evolution of gene expression mediated by cis-regulatory changes is thought to be an important contributor to organismal adaptation, but identifying adaptive cis-regulatory changes is challenging due to the difficulty in knowing the expectation under no positive selection. A new approach for detecting positive selection on transcription factor binding sites (TFBSs) was recently developed, thanks to the application of machine learning in predicting transcription factor (TF) binding affinities of DNA sequences. Given a TFBS sequence from a focal species and the corresponding inferred ancestral sequence that differs from the former at n sites, one can predict the TF binding affinities of many n-step mutational neighbors of the ancestral sequence and obtain a null distribution of the derived binding affinity, which allows testing whether the binding affinity of the real derived sequence deviates significantly from the null distribution. Applying this test genomically to all experimentally identified binding sites of three TFs in humans, a recent study reported positive selection for elevated binding affinities of TFBSs. Here we show that this genomic test suffers from an ascertainment bias because, even in the absence of positive selection for strengthened binding, the binding affinities of known human TFBSs are more likely to have increased than decreased in evolution. We demonstrate by computer simulation that this bias inflates the false positive rate of the selection test. We propose several methods to mitigate the ascertainment bias and show that almost all previously reported positive selection signals disappear when these methods are applied.
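The test described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_affinity` is a hypothetical stand-in for a trained machine-learning affinity predictor, `CONSENSUS` is an invented motif, mutations are drawn uniformly (an assumption; a real analysis might weight substitutions), and the one-sided p-value for elevated affinity is likewise an assumption.

```python
import random

BASES = "ACGT"
CONSENSUS = "TGACTCAGGC"  # hypothetical motif, for illustration only


def toy_affinity(seq, consensus=CONSENSUS):
    """Hypothetical affinity score: number of positions matching the consensus."""
    return sum(a == b for a, b in zip(seq, consensus))


def n_step_neighbor(ancestral, n, rng):
    """Return a copy of `ancestral` mutated at n distinct, uniformly chosen sites."""
    seq = list(ancestral)
    for i in rng.sample(range(len(seq)), n):
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)


def selection_test(ancestral, derived, affinity, n_null=2000, seed=0):
    """Compare the derived affinity with a null built from n-step neighbors.

    Returns (n, observed affinity, one-sided p-value for elevated affinity).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must have equal length")
    n = sum(a != d for a, d in zip(ancestral, derived))
    if n == 0:
        raise ValueError("derived sequence is identical to the ancestral one")
    rng = random.Random(seed)
    observed = affinity(derived)
    null = [affinity(n_step_neighbor(ancestral, n, rng)) for _ in range(n_null)]
    # Add-one correction keeps the empirical p-value strictly above zero.
    p = (1 + sum(x >= observed for x in null)) / (n_null + 1)
    return n, observed, p


if __name__ == "__main__":
    ancestral = "AAAAAAAAAA"
    derived = "TGACAAAAAA"  # differs from the ancestor at 3 sites
    print(selection_test(ancestral, derived, toy_affinity))
```

Note that the sketch only reproduces the per-site test. The ascertainment bias discussed in the abstract arises from which sites are chosen for testing (experimentally identified binding sites in the focal species), which this code does not model.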
format Online
Article
Text
id pubmed-10473660
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10473660 2023-09-02 Ascertainment bias in the genomic test of positive selection on regulatory sequences Jiang, Daohan Zhang, Jianzhi bioRxiv Article
Cold Spring Harbor Laboratory 2023-08-21 /pmc/articles/PMC10473660/ /pubmed/37662307 http://dx.doi.org/10.1101/2023.08.20.554030 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Ascertainment bias in the genomic test of positive selection on regulatory sequences
topic Article