
CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes



Bibliographic Details
Main Authors: Shen, Yin, Chen, Ling-Ling, Gao, Junxiang
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9170768/
https://www.ncbi.nlm.nih.gov/pubmed/33662624
http://dx.doi.org/10.1016/j.gpb.2020.06.021
_version_ 1784721507408150528
author Shen, Yin
Chen, Ling-Ling
Gao, Junxiang
author_facet Shen, Yin
Chen, Ling-Ling
Gao, Junxiang
author_sort Shen, Yin
collection PubMed
description Chromatin accessibility is a highly informative structural feature for understanding gene transcription regulation, because it indicates the degree to which nuclear macromolecules such as proteins and RNAs can access chromosomal DNA. Studies have shown that chromatin accessibility is highly dynamic during stress response, stimulus response, and developmental transition. Moreover, physical access to chromosomal DNA in eukaryotes is highly cell-specific. Therefore, current technologies such as DNase-seq, ATAC-seq, and FAIRE-seq reveal only a portion of the open chromatin regions (OCRs) present in a given species. Thus, the genome-wide distribution of OCRs remains unknown. In this study, we developed a bioinformatics tool called CharPlant for the de novo prediction of OCRs in plant genomes. To develop this tool, we constructed a three-layer convolutional neural network (CNN) and subsequently trained the CNN using DNase-seq and ATAC-seq datasets of four plant species. The model simultaneously learns the sequence motifs and regulatory logics, which are jointly used to determine DNA accessibility. All of these steps are integrated into CharPlant, which can be run using a simple command line. The results of data analysis using CharPlant in this study demonstrate its prediction power and computational efficiency. To our knowledge, CharPlant is the first de novo prediction tool that can identify potential OCRs in the whole genome. The source code of CharPlant and supporting files are freely available from https://github.com/Yin-Shen/CharPlant.
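The description above says a convolutional neural network learns sequence motifs from DNA. As a rough illustration of that idea only, the sketch below one-hot encodes a DNA string and slides a single motif filter over it, followed by ReLU and global max pooling. It is not CharPlant's architecture or code: the function names, the "TATA" filter, and the bias value are hypothetical choices made for this example, and a trained network would learn its filter weights from data rather than have them set by hand.

```python
from typing import List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode a DNA string as rows of 4 values in A, C, G, T order.

    Bases outside ACGT (for example N) become all-zero rows.
    """
    rows = []
    for base in seq.upper():
        row = [0.0] * 4
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def conv1d_relu(x: List[List[float]], kernel: List[List[float]],
                bias: float = 0.0) -> List[float]:
    """Slide one filter (width x 4) along the sequence, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(x) - width + 1):
        total = bias
        for k in range(width):
            for c in range(4):
                total += x[start + k][c] * kernel[k][c]
        out.append(max(0.0, total))
    return out


def max_pool_global(activations: List[float]) -> float:
    """Keep only the strongest response of the filter along the sequence."""
    return max(activations) if activations else 0.0


# Hypothetical hand-made filter that responds to the motif "TATA".
# With bias -3.0, only a perfect 4-base match gives a positive activation.
tata_filter = one_hot("TATA")
scores = conv1d_relu(one_hot("GGCTATAAGG"), tata_filter, bias=-3.0)
best = max_pool_global(scores)
print(scores, best)
```

In a multi-layer network, many such filters would run in parallel, and later layers would combine their outputs into an accessibility score. Only the first step is shown here.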
format Online
Article
Text
id pubmed-9170768
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9170768 2022-06-08 CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes Shen, Yin Chen, Ling-Ling Gao, Junxiang Genomics Proteomics Bioinformatics Application Note
Elsevier 2021-10 2021-03-02 /pmc/articles/PMC9170768/ /pubmed/33662624 http://dx.doi.org/10.1016/j.gpb.2020.06.021 Text en © 2021 The Author https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Application Note
Shen, Yin
Chen, Ling-Ling
Gao, Junxiang
CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title_full CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title_fullStr CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title_full_unstemmed CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title_short CharPlant: A De Novo Open Chromatin Region Prediction Tool for Plant Genomes
title_sort charplant: a de novo open chromatin region prediction tool for plant genomes
topic Application Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9170768/
https://www.ncbi.nlm.nih.gov/pubmed/33662624
http://dx.doi.org/10.1016/j.gpb.2020.06.021
work_keys_str_mv AT shenyin charplantadenovoopenchromatinregionpredictiontoolforplantgenomes
AT chenlingling charplantadenovoopenchromatinregionpredictiontoolforplantgenomes
AT gaojunxiang charplantadenovoopenchromatinregionpredictiontoolforplantgenomes