
Neural networks with circular filters enable data efficient inference of sequence motifs

MOTIVATION: Nucleic acids and proteins often have localized sequence motifs that enable highly specific interactions. Due to the biological relevance of sequence motifs, numerous inference methods have been developed. Recently, convolutional neural networks (CNNs) have achieved state of the art perf...


Bibliographic Details
Main authors:	Blum, Christopher F; Kollmann, Markus
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Original Papers
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6792110/ https://www.ncbi.nlm.nih.gov/pubmed/30918943 http://dx.doi.org/10.1093/bioinformatics/btz194

author	Blum, Christopher F; Kollmann, Markus
collection	PubMed
description	MOTIVATION: Nucleic acids and proteins often have localized sequence motifs that enable highly specific interactions. Due to the biological relevance of sequence motifs, numerous inference methods have been developed. Recently, convolutional neural networks (CNNs) have achieved state-of-the-art performance. These methods were able to learn transcription factor binding sites from ChIP-seq data, resulting in accurate predictions on test data. However, CNNs typically distribute learned motifs across multiple filters, making them difficult to interpret. Furthermore, networks trained on small datasets often do not generalize well to new sequences. RESULTS: Here we present circular filters, a novel convolutional architecture that convolves sequences with circularly permutated variants of the same filter. We motivate circular filters by the observation that CNNs frequently learn filters that correspond to shifted and truncated variants of the true motif. Circular filters enable learning of full-length motifs and allow easy interpretation of the learned filters. We show that circular filters improve motif inference performance over a wide range of hyperparameters as well as sequence length. Furthermore, we show that CNNs with circular filters in most cases outperform conventional CNNs at inferring DNA binding sites from ChIP-seq data. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/christopherblum. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-6792110
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6792110 2019-10-18 Neural networks with circular filters enable data efficient inference of sequence motifs Blum, Christopher F; Kollmann, Markus Bioinformatics Original Papers Oxford University Press 2019-10-15 2019-03-27 /pmc/articles/PMC6792110/ /pubmed/30918943 http://dx.doi.org/10.1093/bioinformatics/btz194 Text en © The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Neural networks with circular filters enable data efficient inference of sequence motifs
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6792110/ https://www.ncbi.nlm.nih.gov/pubmed/30918943 http://dx.doi.org/10.1093/bioinformatics/btz194
