
Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs

Long noncoding RNAs (lncRNAs) can exert their function by interacting with the DNA via triplex structure formation. Even though this has been validated with a handful of experiments, a genome-wide analysis of lncRNA-DNA binding is needed. In this paper, we develop and interpret deep learning models...


Bibliographic Details
Main authors: Wang, Fan; Chainani, Pranik; White, Tommy; Yang, Jin; Liu, Yu; Soibam, Benjamin
Format: Online Article Text
Language: English
Published: Taylor & Francis 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6333433/
https://www.ncbi.nlm.nih.gov/pubmed/30486737
http://dx.doi.org/10.1080/15476286.2018.1551704
_version_ 1783387562150199296
author Wang, Fan
Chainani, Pranik
White, Tommy
Yang, Jin
Liu, Yu
Soibam, Benjamin
author_facet Wang, Fan
Chainani, Pranik
White, Tommy
Yang, Jin
Liu, Yu
Soibam, Benjamin
author_sort Wang, Fan
collection PubMed
description Long noncoding RNAs (lncRNAs) can exert their function by interacting with DNA via triplex structure formation. Although this has been validated in a handful of experiments, a genome-wide analysis of lncRNA-DNA binding is needed. In this paper, we develop and interpret deep learning models that predict the genome-wide binding sites, identified by ChIRP-Seq experiments, of 12 different lncRNAs. Among the several deep learning architectures tested, a simple architecture consisting of two convolutional neural network layers performed best, suggesting that local sequence patterns are determinants of the interaction. Further interpretation of the kernels in the model revealed that these local sequence patterns form triplex structures with the corresponding lncRNAs. We uncovered several novel triplex-forming domains (TFDs) of these 12 lncRNAs, as well as previously experimentally verified TFDs of the lncRNAs HOTAIR and MEG3. We experimentally verified two such novel TFDs of the lncRNAs HOTAIR and TUG1, predicted by our method but previously unreported, using electrophoretic mobility shift assays. In conclusion, we show that a simple deep learning architecture can accurately predict genome-wide binding sites of lncRNAs, and that interpretation of the models suggests RNA:DNA:DNA triplex formation as a viable mechanism underlying lncRNA-DNA interactions at the genome-wide level.
format Online
Article
Text
id pubmed-6333433
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Taylor & Francis
record_format MEDLINE/PubMed
spelling pubmed-6333433 2019-01-23 Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs Wang, Fan Chainani, Pranik White, Tommy Yang, Jin Liu, Yu Soibam, Benjamin RNA Biol Research Paper Taylor & Francis 2018-11-28 /pmc/articles/PMC6333433/ /pubmed/30486737 http://dx.doi.org/10.1080/15476286.2018.1551704 Text en © 2018 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.
spellingShingle Research Paper
Wang, Fan
Chainani, Pranik
White, Tommy
Yang, Jin
Liu, Yu
Soibam, Benjamin
Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title_full Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title_fullStr Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title_full_unstemmed Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title_short Deep learning identifies genome-wide DNA binding sites of long noncoding RNAs
title_sort deep learning identifies genome-wide dna binding sites of long noncoding rnas
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6333433/
https://www.ncbi.nlm.nih.gov/pubmed/30486737
http://dx.doi.org/10.1080/15476286.2018.1551704