
An intrinsically interpretable neural network architecture for sequence-to-function learning

MOTIVATION: Sequence-based deep learning approaches have been shown to predict a multitude of functional genomic readouts, including regions of open chromatin and RNA expression of genes. However, a major limitation of current methods is that model interpretation relies on computationally demanding...


Bibliographic Details
Main authors:	Balcı, Ali Tuğrul, Ebeid, Mark Maher, Benos, Panayiotis V, Kostka, Dennis, Chikina, Maria
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Regulatory and Functional Genomics
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311317/ https://www.ncbi.nlm.nih.gov/pubmed/37387140 http://dx.doi.org/10.1093/bioinformatics/btad271

author	Balcı, Ali Tuğrul Ebeid, Mark Maher Benos, Panayiotis V Kostka, Dennis Chikina, Maria
author_sort	Balcı, Ali Tuğrul
collection	PubMed
description	MOTIVATION: Sequence-based deep learning approaches have been shown to predict a multitude of functional genomic readouts, including regions of open chromatin and RNA expression of genes. However, a major limitation of current methods is that model interpretation relies on computationally demanding post hoc analyses, and even then, one can often not explain the internal mechanics of highly parameterized models. Here, we introduce a deep learning architecture called totally interpretable sequence-to-function model (tiSFM). tiSFM improves upon the performance of standard multilayer convolutional models while using fewer parameters. Additionally, while tiSFM is itself technically a multilayer neural network, internal model parameters are intrinsically interpretable in terms of relevant sequence motifs. RESULTS: We analyze published open chromatin measurements across hematopoietic lineage cell-types and demonstrate that tiSFM outperforms a state-of-the-art convolutional neural network model custom-tailored to this dataset. We also show that it correctly identifies context-specific activities of transcription factors with known roles in hematopoietic differentiation, including Pax5 and Ebf1 for B-cells, and Rorc for innate lymphoid cells. tiSFM’s model parameters have biologically meaningful interpretations, and we show the utility of our approach on a complex task of predicting the change in epigenetic state as a function of developmental transition. AVAILABILITY AND IMPLEMENTATION: The source code, including scripts for the analysis of key findings, can be found at https://github.com/boooooogey/ATAConv, implemented in Python.
format	Online Article Text
id	pubmed-10311317
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10311317 2023-07-01 An intrinsically interpretable neural network architecture for sequence-to-function learning Balcı, Ali Tuğrul Ebeid, Mark Maher Benos, Panayiotis V Kostka, Dennis Chikina, Maria Bioinformatics Regulatory and Functional Genomics
Oxford University Press 2023-06-30 /pmc/articles/PMC10311317/ /pubmed/37387140 http://dx.doi.org/10.1093/bioinformatics/btad271 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	An intrinsically interpretable neural network architecture for sequence-to-function learning
topic	Regulatory and Functional Genomics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311317/ https://www.ncbi.nlm.nih.gov/pubmed/37387140 http://dx.doi.org/10.1093/bioinformatics/btad271
