InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification

Bibliographic Details
Main Authors: Srinivasan, Shushrruth Sai; Gong, Yanwen; Xu, Siwei; Hwang, Ahyeon; Xu, Min; Girgenti, Matthew J.; Zhang, Jing
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9026820/
https://www.ncbi.nlm.nih.gov/pubmed/35456427
http://dx.doi.org/10.3390/genes13040621
author Srinivasan, Shushrruth Sai
Gong, Yanwen
Xu, Siwei
Hwang, Ahyeon
Xu, Min
Girgenti, Matthew J.
Zhang, Jing
collection PubMed
description Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive and time-consuming, and may report fuzzy, low-resolution insulator annotations. Therefore, we propose a weakly supervised deep learning method, InsuLock, to address these challenges. Specifically, InsuLock first uses a Siamese neural network to predict whether an insulator exists within a given region (up to 2000 bp). Then, it uses an object detection module to localize insulator boundaries precisely via gradient-weighted class activation mapping (~40 bp resolution). Finally, it quantifies variant impact by comparing the insulator scores of the wild-type and mutant alleles. We applied InsuLock to various bulk and single-cell datasets for performance testing and benchmarking. We showed that it outperformed existing methods with an AUROC of ~0.96 and condensed insulator annotations to ~2.5% of their original size while still demonstrating higher conservation scores and better motif enrichment. Finally, we used InsuLock to predict cell-type-specific variant impacts from brain scATAC-seq data and identified a schizophrenia GWAS variant disrupting an insulator loop proximal to a known risk gene, indicating a possible new mechanism of action for the disease.
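The variant-impact step in the description (comparing insulator scores between the wild-type and mutant alleles) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the InsuLock code: `apply_snv`, `variant_impact`, and `toy_score` are illustrative helpers, the mutant-minus-wild-type sign convention is an assumption, and `toy_score` is a placeholder that merely checks for the string CCCTC; it stands in for a trained model's insulator score and is not the authors' network.

```python
from typing import Callable


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based index pos replaced by alt."""
    if not 0 <= pos < len(seq):
        raise IndexError("variant position outside the sequence")
    if len(alt) != 1:
        raise ValueError("only single-nucleotide substitutions are handled")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_impact(score_fn: Callable[[str], float],
                   ref_seq: str, pos: int, alt: str) -> float:
    """Mutant score minus wild-type score (sign convention is an assumption).

    A negative value means the variant lowers the predicted insulator score.
    """
    return score_fn(apply_snv(ref_seq, pos, alt)) - score_fn(ref_seq)


def toy_score(seq: str) -> float:
    """Placeholder scorer, NOT the InsuLock model: 1.0 if CCCTC occurs."""
    return 1.0 if "CCCTC" in seq.upper() else 0.0


wild_type = "AAGCCCTCTTGA"
# Substituting the C at index 5 breaks the only CCCTC occurrence.
impact = variant_impact(toy_score, wild_type, 5, "A")
print(impact)
```

In practice `score_fn` would wrap whatever trained classifier produces the insulator score for a sequence window; the comparison logic stays the same regardless of the scorer.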
format Online
Article
Text
id pubmed-9026820
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9026820 2022-04-23 InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification Srinivasan, Shushrruth Sai Gong, Yanwen Xu, Siwei Hwang, Ahyeon Xu, Min Girgenti, Matthew J. Zhang, Jing Genes (Basel) Article MDPI 2022-03-30 /pmc/articles/PMC9026820/ /pubmed/35456427 http://dx.doi.org/10.3390/genes13040621 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9026820/
https://www.ncbi.nlm.nih.gov/pubmed/35456427
http://dx.doi.org/10.3390/genes13040621