
DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays

MOTIVATION: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches...

Bibliographic Details
Main Authors:	Chen, Zhanlin; Zhang, Jing; Liu, Jason; Dai, Yi; Lee, Donghoon; Min, Martin Renqiang; Xu, Min; Gerstein, Mark
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2021
Subjects:	Regulatory and Functional Genomics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275369/ https://www.ncbi.nlm.nih.gov/pubmed/34252960 http://dx.doi.org/10.1093/bioinformatics/btab283

collection	PubMed
description	MOTIVATION: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have implemented enhancer discovery as a binary classification problem without accurate boundary detection, producing low-resolution annotations with superfluous regions and reducing the statistical power for downstream analyses (e.g. causal variant mapping and functional validations). Here, we addressed these challenges via a two-step model called Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays (DECODE). First, we employed direct enhancer-activity readouts from novel functional characterization assays, such as STARR-seq, to train a deep neural network for accurate cell-type-specific enhancer prediction. Second, to improve the annotation resolution, we implemented a weakly supervised object detection framework for enhancer localization with precise boundary detection (to a 10 bp resolution) using Gradient-weighted Class Activation Mapping. RESULTS: Our DECODE binary classifier outperformed a state-of-the-art enhancer prediction method by 24% in transgenic mouse validation. Furthermore, the object detection framework can condense enhancer annotations to only 13% of their original size, and these compact annotations have significantly higher conservation scores and genome-wide association study variant enrichments than the original predictions. Overall, DECODE is an effective tool for enhancer classification and precise localization. AVAILABILITY AND IMPLEMENTATION: DECODE source code and pre-processing scripts are available at decode.gersteinlab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
id	pubmed-8275369
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8275369 2021-07-13 Bioinformatics. Oxford University Press 2021-07-12 /pmc/articles/PMC8275369/ /pubmed/34252960 http://dx.doi.org/10.1093/bioinformatics/btab283 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays
