Underlying causes for prevalent false positives and false negatives in STARR-seq data
Self-transcribing active regulatory region sequencing (STARR-seq) and its variants have been widely used to characterize enhancers. However, it has been reported that up to 87% of STARR-seq peaks are located in repressive chromatin and are not functional in the tested cells. While some of the STARR-...
Main authors: | Ni, Pengyu; Wu, Siwen; Su, Zhengchang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516709/ https://www.ncbi.nlm.nih.gov/pubmed/37745976 http://dx.doi.org/10.1093/nargab/lqad085 |
_version_ | 1785109184436502528 |
---|---|
author | Ni, Pengyu Wu, Siwen Su, Zhengchang |
author_facet | Ni, Pengyu Wu, Siwen Su, Zhengchang |
author_sort | Ni, Pengyu |
collection | PubMed |
description | Self-transcribing active regulatory region sequencing (STARR-seq) and its variants have been widely used to characterize enhancers. However, it has been reported that up to 87% of STARR-seq peaks are located in repressive chromatin and are not functional in the tested cells. While some of the STARR-seq peaks in repressive chromatin might be active in other cell/tissue types, some others might be false positives. Meanwhile, many active enhancers may not be identified by the current STARR-seq methods. Although methods have been proposed to mitigate systematic errors caused by the use of plasmid vectors, the artifacts due to the intrinsic limitations of current STARR-seq methods are still prevalent and the underlying causes are not fully understood. Based on predicted cis-regulatory modules (CRMs) and non-CRMs in the human genome as well as predicted active CRMs and non-active CRMs in a few human cell lines/tissues with STARR-seq data available, we reveal prevalent false positives and false negatives in STARR-seq peaks generated by major variants of STARR-seq methods and possible underlying causes. Our results will help design strategies to improve STARR-seq methods and interpret the results. |
format | Online Article Text |
id | pubmed-10516709 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-105167092023-09-23 Underlying causes for prevalent false positives and false negatives in STARR-seq data Ni, Pengyu Wu, Siwen Su, Zhengchang NAR Genom Bioinform Standard Article Self-transcribing active regulatory region sequencing (STARR-seq) and its variants have been widely used to characterize enhancers. However, it has been reported that up to 87% of STARR-seq peaks are located in repressive chromatin and are not functional in the tested cells. While some of the STARR-seq peaks in repressive chromatin might be active in other cell/tissue types, some others might be false positives. Meanwhile, many active enhancers may not be identified by the current STARR-seq methods. Although methods have been proposed to mitigate systematic errors caused by the use of plasmid vectors, the artifacts due to the intrinsic limitations of current STARR-seq methods are still prevalent and the underlying causes are not fully understood. Based on predicted cis-regulatory modules (CRMs) and non-CRMs in the human genome as well as predicted active CRMs and non-active CRMs in a few human cell lines/tissues with STARR-seq data available, we reveal prevalent false positives and false negatives in STARR-seq peaks generated by major variants of STARR-seq methods and possible underlying causes. Our results will help design strategies to improve STARR-seq methods and interpret the results. Oxford University Press 2023-09-22 /pmc/articles/PMC10516709/ /pubmed/37745976 http://dx.doi.org/10.1093/nargab/lqad085 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Standard Article Ni, Pengyu Wu, Siwen Su, Zhengchang Underlying causes for prevalent false positives and false negatives in STARR-seq data |
title | Underlying causes for prevalent false positives and false negatives in STARR-seq data |
topic | Standard Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516709/ https://www.ncbi.nlm.nih.gov/pubmed/37745976 http://dx.doi.org/10.1093/nargab/lqad085 |