Deep learning for genomics using Janggu

In recent years, numerous applications have demonstrated the potential of deep learning for an improved understanding of biological processes. However, most deep learning tools developed so far are designed to address a specific question on a fixed dataset and/or by a fixed model architecture. Here...

Bibliographic Details
Main authors: Kopp, Wolfgang, Monti, Remo, Tamburrini, Annalaura, Ohler, Uwe, Akalin, Altuna
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359359/
https://www.ncbi.nlm.nih.gov/pubmed/32661261
http://dx.doi.org/10.1038/s41467-020-17155-y
_version_ 1783559033111707648
author Kopp, Wolfgang
Monti, Remo
Tamburrini, Annalaura
Ohler, Uwe
Akalin, Altuna
author_facet Kopp, Wolfgang
Monti, Remo
Tamburrini, Annalaura
Ohler, Uwe
Akalin, Altuna
author_sort Kopp, Wolfgang
collection PubMed
description In recent years, numerous applications have demonstrated the potential of deep learning for an improved understanding of biological processes. However, most deep learning tools developed so far are designed to address a specific question on a fixed dataset and/or by a fixed model architecture. Here we present Janggu, a Python library that facilitates deep learning for genomics applications, aiming to ease data acquisition and model evaluation. Among its key features are special dataset objects, which form a unified and flexible data acquisition and pre-processing framework for genomics data that enables streamlining of future research applications through reusable components. Through a NumPy-like interface, these dataset objects are directly compatible with popular deep learning libraries, including Keras and PyTorch. Janggu offers the possibility to visualize predictions as genomic tracks or to export them to the bigWig format, and it provides utilities for Keras-based models. We illustrate the functionality of Janggu on several deep learning genomics applications. First, we evaluate different model topologies for the task of predicting binding sites for the transcription factor JunD. Second, we demonstrate the framework on published models for predicting chromatin effects. Third, we show that promoter usage measured by CAGE can be predicted using DNase hypersensitivity, histone modifications and DNA sequence features. We improve the performance of these models using a novel feature in Janggu that allows us to include high-order sequence features. We believe that Janggu will help to significantly reduce repetitive programming overhead for deep learning applications in genomics, and will enable computational biologists to rapidly assess biological hypotheses.
format Online
Article
Text
id pubmed-7359359
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-73593592020-07-20 Deep learning for genomics using Janggu Kopp, Wolfgang Monti, Remo Tamburrini, Annalaura Ohler, Uwe Akalin, Altuna Nat Commun Article
Nature Publishing Group UK 2020-07-13 /pmc/articles/PMC7359359/ /pubmed/32661261 http://dx.doi.org/10.1038/s41467-020-17155-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Kopp, Wolfgang
Monti, Remo
Tamburrini, Annalaura
Ohler, Uwe
Akalin, Altuna
Deep learning for genomics using Janggu
title Deep learning for genomics using Janggu
title_full Deep learning for genomics using Janggu
title_fullStr Deep learning for genomics using Janggu
title_full_unstemmed Deep learning for genomics using Janggu
title_short Deep learning for genomics using Janggu
title_sort deep learning for genomics using janggu
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7359359/
https://www.ncbi.nlm.nih.gov/pubmed/32661261
http://dx.doi.org/10.1038/s41467-020-17155-y
work_keys_str_mv AT koppwolfgang deeplearningforgenomicsusingjanggu
AT montiremo deeplearningforgenomicsusingjanggu
AT tamburriniannalaura deeplearningforgenomicsusingjanggu
AT ohleruwe deeplearningforgenomicsusingjanggu
AT akalinaltuna deeplearningforgenomicsusingjanggu
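
Illustrative note on the description field: the abstract states that Janggu can include high-order sequence features. The sketch below shows one plausible reading of that idea, namely one-hot encoding overlapping k-mers (for example dinucleotides) instead of single nucleotides. It is a plain NumPy illustration under that assumption, not Janggu's implementation or API; the function name `kmer_one_hot` and its parameters are hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed nucleotide ordering, chosen for this sketch


def kmer_one_hot(sequence, order=1):
    """One-hot encode the overlapping k-mers (k = order) of a DNA sequence.

    Returns an array of shape (len(sequence) - order + 1, 4 ** order).
    Windows that contain a character outside ACGT (such as N) stay all-zero.
    """
    lookup = {base: idx for idx, base in enumerate(ALPHABET)}
    n_windows = len(sequence) - order + 1
    if order < 1 or n_windows < 1:
        raise ValueError("order must be >= 1 and no longer than the sequence")

    encoded = np.zeros((n_windows, 4 ** order), dtype=np.int8)
    for pos in range(n_windows):
        index = 0
        for base in sequence[pos:pos + order].upper():
            if base not in lookup:
                break  # unknown base: leave this window all-zero
            # Treat the k-mer as a base-4 number to get its column index.
            index = index * 4 + lookup[base]
        else:
            encoded[pos, index] = 1
    return encoded


# Order 1 is the usual per-nucleotide one-hot encoding (4 columns).
first_order = kmer_one_hot("ACGT", order=1)
# Order 2 encodes the dinucleotides AC, CG, GT into 16 columns.
second_order = kmer_one_hot("ACGT", order=2)
```

With order 2, each row marks one of the 16 possible dinucleotides, so a model sees neighbouring-base context directly in its input rather than having to learn it from single-nucleotide columns.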