InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification
Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive, time-consuming, and may report fuzzy insulator ann...
Main Authors: | Srinivasan, Shushrruth Sai; Gong, Yanwen; Xu, Siwei; Hwang, Ahyeon; Xu, Min; Girgenti, Matthew J.; Zhang, Jing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9026820/ https://www.ncbi.nlm.nih.gov/pubmed/35456427 http://dx.doi.org/10.3390/genes13040621 |
_version_ | 1784691206840647680 |
---|---|
author | Srinivasan, Shushrruth Sai Gong, Yanwen Xu, Siwei Hwang, Ahyeon Xu, Min Girgenti, Matthew J. Zhang, Jing |
author_sort | Srinivasan, Shushrruth Sai |
collection | PubMed |
description | Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive, time-consuming, and may report fuzzy insulator annotations with low resolution. Therefore, we propose a weakly supervised deep learning method, InsuLock, to address these challenges. Specifically, InsuLock first utilizes a Siamese neural network to predict the existence of insulators within a given region (up to 2000 bp). Then, it uses an object detection module for precise insulator boundary localization via gradient-weighted class activation mapping (~40 bp resolution). Finally, it quantifies variant impacts by comparing the insulator score differences between the wild-type and mutant alleles. We applied InsuLock on various bulk and single-cell datasets for performance testing and benchmarking. We showed that it outperformed existing methods with an AUROC of ~0.96 and condensed insulator annotations to ~2.5% of their original size while still demonstrating higher conservation scores and better motif enrichments. Finally, we utilized InsuLock to make cell-type-specific variant impacts from brain scATAC-seq data and identified a schizophrenia GWAS variant disrupting an insulator loop proximal to a known risk gene, indicating a possible new mechanism of action for the disease. |
format | Online Article Text |
id | pubmed-9026820 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9026820 2022-04-23 InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification Srinivasan, Shushrruth Sai Gong, Yanwen Xu, Siwei Hwang, Ahyeon Xu, Min Girgenti, Matthew J. Zhang, Jing Genes (Basel) Article Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive, time-consuming, and may report fuzzy insulator annotations with low resolution. Therefore, we propose a weakly supervised deep learning method, InsuLock, to address these challenges. Specifically, InsuLock first utilizes a Siamese neural network to predict the existence of insulators within a given region (up to 2000 bp). Then, it uses an object detection module for precise insulator boundary localization via gradient-weighted class activation mapping (~40 bp resolution). Finally, it quantifies variant impacts by comparing the insulator score differences between the wild-type and mutant alleles. We applied InsuLock on various bulk and single-cell datasets for performance testing and benchmarking. We showed that it outperformed existing methods with an AUROC of ~0.96 and condensed insulator annotations to ~2.5% of their original size while still demonstrating higher conservation scores and better motif enrichments. Finally, we utilized InsuLock to make cell-type-specific variant impacts from brain scATAC-seq data and identified a schizophrenia GWAS variant disrupting an insulator loop proximal to a known risk gene, indicating a possible new mechanism of action for the disease. MDPI 2022-03-30 /pmc/articles/PMC9026820/ /pubmed/35456427 http://dx.doi.org/10.3390/genes13040621 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification |
title_sort | insulock: a weakly supervised learning approach for accurate insulator prediction, and variant impact quantification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9026820/ https://www.ncbi.nlm.nih.gov/pubmed/35456427 http://dx.doi.org/10.3390/genes13040621 |