Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks

MOTIVATION: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequence...

Bibliographic Details
Main Authors:	Avsec, Žiga; Barekatain, Mohammadamin; Cheng, Jun; Gagneur, Julien
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2018
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5905632/ https://www.ncbi.nlm.nih.gov/pubmed/29155928 http://dx.doi.org/10.1093/bioinformatics/btx727

_version_	1783315293888577536
author	Avsec, Žiga; Barekatain, Mohammadamin; Cheng, Jun; Gagneur, Julien
author_facet	Avsec, Žiga; Barekatain, Mohammadamin; Cheng, Jun; Gagneur, Julien
author_sort	Avsec, Žiga
collection	PubMed
description	MOTIVATION: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed. RESULTS: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 120 out of 123 proteins. We also developed a deep neural network for human splice branchpoint based on spline transformations that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox. AVAILABILITY AND IMPLEMENTATION: Spline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at https://github.com/gagneurlab/Manuscript_Avsec_Bioinformatics_2017. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format	Online Article Text
id	pubmed-5905632
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-5905632 2018-04-23 Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks Avsec, Žiga; Barekatain, Mohammadamin; Cheng, Jun; Gagneur, Julien Bioinformatics Original Papers
Oxford University Press 2018-04-15 2017-11-16 /pmc/articles/PMC5905632/ /pubmed/29155928 http://dx.doi.org/10.1093/bioinformatics/btx727 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Original Papers Avsec, Žiga Barekatain, Mohammadamin Cheng, Jun Gagneur, Julien Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks
title	Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks
title_sort	modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks
topic	Original Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5905632/ https://www.ncbi.nlm.nih.gov/pubmed/29155928 http://dx.doi.org/10.1093/bioinformatics/btx727
