OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data
Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often...
Main Authors: | Wang, Jiayin; Chen, Liubin; Zhang, Xuanping; Tong, Yao; Zheng, Tian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8198695/ https://www.ncbi.nlm.nih.gov/pubmed/34071577 http://dx.doi.org/10.3390/ijms22115802 |
_version_ | 1783707200561086464 |
---|---|
author | Wang, Jiayin; Chen, Liubin; Zhang, Xuanping; Tong, Yao; Zheng, Tian |
author_facet | Wang, Jiayin; Chen, Liubin; Zhang, Xuanping; Tong, Yao; Zheng, Tian |
author_sort | Wang, Jiayin |
collection | PubMed |
description | Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often expensive. It has become popular to detect OCRs from plasma cell-free DNA (cfDNA) sequencing data, because both the fragmentation modes of cfDNA and the sequencing coverage in OCRs are significantly different from those in other regions. However, it is a challenging computational problem to accurately detect OCRs from plasma cfDNA-seq data, as multiple factors—e.g., sequencing and mapping bias, insufficient read depth, etc.—often mislead the computational model. In this paper, we propose a novel bioinformatics pipeline, OCRDetector, for detecting OCRs from whole-genome cfDNA sequencing data. The pipeline calculates the window protection score (WPS) waveform and the cfDNA sequencing coverage. To validate the proposed pipeline, we compared the percentage overlap of our OCRs with those obtained by other methods. The experimental results show that 81% of the TSS regions of housekeeping genes are detected, and our results have obvious tissue specificity. In addition, the overlap percentage between our OCRs and the high-confidence OCRs obtained by ATAC-seq or DNase-seq is greater than 70%. |
format | Online Article Text |
id | pubmed-8198695 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8198695 2021-06-14 OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data Wang, Jiayin; Chen, Liubin; Zhang, Xuanping; Tong, Yao; Zheng, Tian Int J Mol Sci Article MDPI 2021-05-28 /pmc/articles/PMC8198695/ /pubmed/34071577 http://dx.doi.org/10.3390/ijms22115802 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8198695/ https://www.ncbi.nlm.nih.gov/pubmed/34071577 http://dx.doi.org/10.3390/ijms22115802 |
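
The description above says the pipeline calculates a window protection score (WPS) waveform from cfDNA fragments. As a rough illustration only, the sketch below uses the commonly cited WPS definition: the number of fragments that completely span a window centred on a position, minus the number of fragments with an endpoint inside that window. The record does not give the pipeline's actual parameters or code, so the function names, the inclusive coordinate convention, and the default 120 bp window are assumptions made for this example and are not taken from OCRDetector.

```python
from typing import List, Tuple

# A fragment is (start, end) in 0-based inclusive genomic coordinates.
# This convention is an assumption made for the sketch.
Fragment = Tuple[int, int]


def window_protection_score(
    fragments: List[Fragment], position: int, window: int = 120
) -> int:
    """WPS at one position: spanning fragments minus fragments ending in the window.

    The default window of 120 bp is an assumed value, not one stated in the record.
    """
    half = window // 2
    w_start = position - half
    w_end = position + half

    spanning = 0  # fragments that cover the whole window
    ending = 0    # fragments with at least one endpoint inside the window
    for start, end in fragments:
        if start <= w_start and end >= w_end:
            spanning += 1
        elif w_start <= start <= w_end or w_start <= end <= w_end:
            ending += 1
    return spanning - ending


def wps_waveform(
    fragments: List[Fragment], region_start: int, region_end: int, window: int = 120
) -> List[int]:
    """WPS for every position in the half-open range [region_start, region_end)."""
    return [
        window_protection_score(fragments, pos, window)
        for pos in range(region_start, region_end)
    ]


if __name__ == "__main__":
    # Toy fragments, invented purely for demonstration.
    toy = [(0, 200), (50, 210), (100, 160)]
    print(wps_waveform(toy, 100, 103, window=20))
```

This per-position loop is quadratic in the worst case and is meant only to make the definition concrete. A real implementation would more likely read fragments from an alignment file and use prefix sums or an interval index.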