DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays
MOTIVATION: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches...
Main authors: | Chen, Zhanlin; Zhang, Jing; Liu, Jason; Dai, Yi; Lee, Donghoon; Min, Martin Renqiang; Xu, Min; Gerstein, Mark |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Regulatory and Functional Genomics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275369/ https://www.ncbi.nlm.nih.gov/pubmed/34252960 http://dx.doi.org/10.1093/bioinformatics/btab283 |
_version_ | 1783721699871555584 |
---|---|
author | Chen, Zhanlin Zhang, Jing Liu, Jason Dai, Yi Lee, Donghoon Min, Martin Renqiang Xu, Min Gerstein, Mark |
author_sort | Chen, Zhanlin |
collection | PubMed |
description | MOTIVATION: Mapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have implemented enhancer discovery as a binary classification problem without accurate boundary detection, producing low-resolution annotations with superfluous regions and reducing the statistical power for downstream analyses (e.g. causal variant mapping and functional validations). Here, we addressed these challenges via a two-step model called Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays (DECODE). First, we employed direct enhancer-activity readouts from novel functional characterization assays, such as STARR-seq, to train a deep neural network for accurate cell-type-specific enhancer prediction. Second, to improve the annotation resolution, we implemented a weakly supervised object detection framework for enhancer localization with precise boundary detection (to a 10 bp resolution) using Gradient-weighted Class Activation Mapping. RESULTS: Our DECODE binary classifier outperformed a state-of-the-art enhancer prediction method by 24% in transgenic mouse validation. Furthermore, the object detection framework can condense enhancer annotations to only 13% of their original size, and these compact annotations have significantly higher conservation scores and genome-wide association study variant enrichments than the original predictions. Overall, DECODE is an effective tool for enhancer classification and precise localization. AVAILABILITY AND IMPLEMENTATION: DECODE source code and pre-processing scripts are available at decode.gersteinlab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8275369 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8275369 2021-07-13 Bioinformatics Regulatory and Functional Genomics Oxford University Press 2021-07-12 /pmc/articles/PMC8275369/ /pubmed/34252960 http://dx.doi.org/10.1093/bioinformatics/btab283 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays |
topic | Regulatory and Functional Genomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275369/ https://www.ncbi.nlm.nih.gov/pubmed/34252960 http://dx.doi.org/10.1093/bioinformatics/btab283 |
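
The description above says that DECODE refines enhancer boundaries to a 10 bp resolution using Gradient-weighted Class Activation Mapping (Grad-CAM). The following is a minimal sketch of that general idea in one dimension, not the authors' implementation: the function names `grad_cam_1d` and `refine_boundaries`, the toy activations and gradients, the 0.5 relative threshold, and the rule of keeping the contiguous run around the peak are all assumptions made for illustration. Only the 10 bp bin size comes from the record.

```python
from typing import List, Optional, Tuple


def grad_cam_1d(activations: List[List[float]],
                gradients: List[List[float]]) -> List[float]:
    """Grad-CAM over a 1D feature map.

    activations[k][i] is channel k at bin i; gradients[k][i] is the gradient
    of the class score with respect to that activation (both hypothetical).
    """
    n_bins = len(activations[0])
    cam = [0.0] * n_bins
    for act, grad in zip(activations, gradients):
        weight = sum(grad) / len(grad)  # average the gradients over positions
        for i, a in enumerate(act):
            cam[i] += weight * a        # weighted sum of channel activations
    return [max(0.0, v) for v in cam]   # ReLU keeps positive evidence only


def refine_boundaries(cam: List[float], region_start: int,
                      bin_size: int = 10,
                      rel_threshold: float = 0.5) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the contiguous bins around the CAM peak.

    rel_threshold is an assumed cutoff, expressed as a fraction of the peak.
    """
    peak = max(cam)
    if peak <= 0.0:
        return None  # no positive evidence anywhere in the region
    cutoff = rel_threshold * peak
    left = right = cam.index(peak)
    while left > 0 and cam[left - 1] >= cutoff:
        left -= 1
    while right < len(cam) - 1 and cam[right + 1] >= cutoff:
        right += 1
    return (region_start + left * bin_size,
            region_start + (right + 1) * bin_size)


# Toy inputs: 2 channels over 8 bins of 10 bp each (made-up values).
activations = [
    [0.0, 0.1, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0],
    [0.5, 0.4, 0.1, 0.0, 0.1, 0.0, 0.4, 0.5],
]
gradients = [
    [1.0] * 8,    # channel 0 supports the enhancer class
    [-0.5] * 8,   # channel 1 argues against it
]

cam = grad_cam_1d(activations, gradients)
print(refine_boundaries(cam, region_start=1000))  # (1030, 1060)
```

In this toy case an 80 bp candidate region is condensed to the 30 bp window where the class activation is concentrated. A real pipeline would take the activations and gradients from a trained network rather than from hand-written lists.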