OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data

Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often...

Bibliographic Details
Main authors: Wang, Jiayin; Chen, Liubin; Zhang, Xuanping; Tong, Yao; Zheng, Tian
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8198695/
https://www.ncbi.nlm.nih.gov/pubmed/34071577
http://dx.doi.org/10.3390/ijms22115802
collection PubMed
description Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often expensive. It has become popular to detect OCRs from plasma cell-free DNA (cfDNA) sequencing data, because both the fragmentation modes of cfDNA and the sequencing coverage in OCRs are significantly different from those in other regions. However, it is a challenging computational problem to accurately detect OCRs from plasma cfDNA-seq data, as multiple factors—e.g., sequencing and mapping bias, insufficient read depth, etc.—often mislead the computational model. In this paper, we propose a novel bioinformatics pipeline, OCRDetector, for detecting OCRs from whole-genome cfDNA sequencing data. The pipeline calculates the window protection score (WPS) waveform and the cfDNA sequencing coverage. To validate the proposed pipeline, we compared the percentage overlap of our OCRs with those obtained by other methods. The experimental results show that 81% of the TSS regions of housekeeping genes are detected, and our results have obvious tissue specificity. In addition, the overlap percentage between our OCRs and the high-confidence OCRs obtained by ATAC-seq or DNase-seq is greater than 70%.
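The abstract names two quantities without defining them: the window protection score (WPS) and a percentage overlap between region sets. The sketch below is illustrative only and is not the OCRDetector implementation. It assumes the commonly used WPS definition (fragments fully spanning a window centered on a position, minus fragments with an endpoint inside that window) and a 120 bp window. It also assumes one of several reasonable overlap measures: the share of reference regions touched by at least one detected region. The function names, the window size, and the choice of denominator are assumptions that do not come from this record.

```python
def wps(fragments, position, window=120):
    """Window protection score at one genomic position (assumed definition).

    fragments: iterable of (start, end) pairs, inclusive coordinates.
    Returns (#fragments spanning the whole window)
          - (#fragments with an endpoint inside the window).
    """
    half = window // 2
    w_start, w_end = position - half, position + half
    spanning = 0
    ending = 0
    for start, end in fragments:
        if start <= w_start and end >= w_end:
            # Fragment covers the entire window: evidence of protection.
            spanning += 1
        elif w_start <= start <= w_end or w_start <= end <= w_end:
            # Fragment starts or stops inside the window: evidence of a cut.
            ending += 1
    return spanning - ending


def wps_waveform(fragments, region_start, region_end, window=120):
    """WPS evaluated at every position in [region_start, region_end]."""
    fragments = list(fragments)
    return [wps(fragments, pos, window)
            for pos in range(region_start, region_end + 1)]


def overlap_percentage(detected, reference):
    """Percentage of reference intervals overlapped by any detected interval."""
    if not reference:
        return 0.0
    hits = sum(
        any(d_start <= r_end and r_start <= d_end for d_start, d_end in detected)
        for r_start, r_end in reference
    )
    return 100.0 * hits / len(reference)


if __name__ == "__main__":
    # Toy fragments on a single chromosome (hypothetical coordinates).
    frags = [(0, 200), (50, 250), (100, 130), (160, 300)]
    print(wps(frags, 120))
    print(overlap_percentage([(10, 20), (100, 200)],
                             [(15, 30), (40, 50), (150, 160), (500, 600)]))
```

In a real analysis the fragment coordinates would come from aligned paired-end reads, and the per-position loop would be replaced by a vectorized or sweep-line computation. The simple form is kept here so the definition stays readable.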
format Online
Article
Text
id pubmed-8198695
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8198695 2021-06-14 Int J Mol Sci Article MDPI 2021-05-28 /pmc/articles/PMC8198695/ /pubmed/34071577 http://dx.doi.org/10.3390/ijms22115802 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).