Hybrid Multitask Learning Reveals Sequence Features Driving Specificity in the CRISPR/Cas9 System
CRISPR/Cas9 technology can edit genomes precisely and is at the heart of many recent scientific and medical advances. Progress in biomedical research is hindered, however, by off-target effects: the inadvertent burden placed on the genome when genome editors are employed. Alt...
Main authors: | Vora, Dhvani Sandip; Yadav, Shashank; Sundar, Durai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10135716/ https://www.ncbi.nlm.nih.gov/pubmed/37189388 http://dx.doi.org/10.3390/biom13040641 |
_version_ | 1785032045719715840 |
---|---|
author | Vora, Dhvani Sandip Yadav, Shashank Sundar, Durai |
author_sort | Vora, Dhvani Sandip |
collection | PubMed |
description | CRISPR/Cas9 technology can edit genomes precisely and is at the heart of many recent scientific and medical advances. Progress in biomedical research is hindered, however, by off-target effects: the inadvertent burden placed on the genome when genome editors are employed. Although experimental screens for detecting off-targets have improved understanding of Cas9 activity, that knowledge remains incomplete because the derived rules do not extrapolate well to new target sequences. Because the rules that drive Cas9 activity are not fully understood, recently developed off-target prediction tools have increasingly relied on machine learning and deep learning techniques to assess the full threat of likely off-targets. In this study, we present a count-based approach and a deep-learning-based approach to derive the sequence features that are important in determining Cas9 activity at a sequence. Off-target determination poses two major challenges: identifying a likely site of Cas9 activity and predicting the extent of Cas9 activity at that site. The hybrid multitask CNN–biLSTM model developed here, named CRISP–RCNN, simultaneously predicts off-targets and the extent of activity at those off-targets. Using integrated gradients and weighting kernels to approximate feature importance, we analyzed nucleotide and position preference and mismatch tolerance. |
format | Online Article Text |
id | pubmed-10135716 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10135716 2023-04-28 Hybrid Multitask Learning Reveals Sequence Features Driving Specificity in the CRISPR/Cas9 System Vora, Dhvani Sandip Yadav, Shashank Sundar, Durai Biomolecules Article MDPI 2023-04-03 /pmc/articles/PMC10135716/ /pubmed/37189388 http://dx.doi.org/10.3390/biom13040641 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Hybrid Multitask Learning Reveals Sequence Features Driving Specificity in the CRISPR/Cas9 System |
topic | Article |
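
The description mentions a count-based approach to sequence features and an analysis of mismatch tolerance, but this record does not spell out the procedure. The sketch below is only one plausible illustration of the general idea, not the authors' method: it counts, per position, how often a guide and an aligned candidate site disagree. The function names `mismatch_positions` and `mismatch_profile` and the toy sequences are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def mismatch_positions(guide, site):
    """Return the 1-based positions where an aligned site differs from the guide.

    Assumes both sequences are already aligned and of equal length
    (no bulges or gaps); this is a simplification for illustration.
    """
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return [
        i + 1
        for i, (g, s) in enumerate(zip(guide.upper(), site.upper()))
        if g != s
    ]


def mismatch_profile(pairs):
    """Count, for each position, how many (guide, site) pairs mismatch there."""
    counts = Counter()
    for guide, site in pairs:
        counts.update(mismatch_positions(guide, site))
    # Sort by position so the profile reads left to right along the guide.
    return dict(sorted(counts.items()))


# Hypothetical toy data: short made-up sequences, not real guides.
pairs = [
    ("GACGT", "GACGT"),  # perfect match
    ("GACGT", "TACGT"),  # mismatch at position 1
    ("GACGT", "TACCT"),  # mismatches at positions 1 and 4
]
profile = mismatch_profile(pairs)
print(profile)
```

In practice such a profile would presumably be computed over experimentally validated off-target sites and compared against inactive sites; the counts here only show the bookkeeping.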