AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity
BACKGROUND: More and more Cas9 variants with higher specificity are developed to avoid the off-target effect, which brings a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while the methods based on deep learning often lack interpretability,...
Main authors: | Xiao, Li-Ming; Wan, Yun-Qi; Jiang, Zhen-Ran |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667445/ https://www.ncbi.nlm.nih.gov/pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6 |
_version_ | 1784614388421885952 |
---|---|
author | Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran |
author_facet | Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran |
author_sort | Xiao, Li-Ming |
collection | PubMed |
description | BACKGROUND: More and more Cas9 variants with higher specificity are developed to avoid the off-target effect, which brings a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while the methods based on deep learning often lack interpretability, which makes researchers have to trade-off accuracy and interpretability. It is necessary to develop a method that can not only match deep learning-based methods in performance but also with good interpretability that can be comparable to conventional machine learning methods. RESULTS: To overcome these problems, we propose an intrinsically interpretable method called AttCRISPR based on deep learning to predict the on-target activity. The advantage of AttCRISPR lies in using the ensemble learning strategy to stack available encoding-based methods and embedding-based methods with strong interpretability. Comparison with the state-of-the-art methods using WT-SpCas9, eSpCas9(1.1), SpCas9-HF1 datasets, AttCRISPR can achieve an average Spearman value of 0.872, 0.867, 0.867, respectively on several public datasets, which is superior to these methods. Furthermore, benefits from two attention modules—one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, we can understand the decisions made by AttCRISPR at both global and local levels without other post hoc explanations techniques. CONCLUSION: With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at a global level. And at a local level, we prove that the interpretability of AttCRISPR can be used to guide the researchers to design sgRNA with higher activity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04509-6. |
format | Online Article Text |
id | pubmed-8667445 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-86674452021-12-13 AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran BMC Bioinformatics Research BACKGROUND: More and more Cas9 variants with higher specificity are developed to avoid the off-target effect, which brings a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while the methods based on deep learning often lack interpretability, which makes researchers have to trade-off accuracy and interpretability. It is necessary to develop a method that can not only match deep learning-based methods in performance but also with good interpretability that can be comparable to conventional machine learning methods. RESULTS: To overcome these problems, we propose an intrinsically interpretable method called AttCRISPR based on deep learning to predict the on-target activity. The advantage of AttCRISPR lies in using the ensemble learning strategy to stack available encoding-based methods and embedding-based methods with strong interpretability. Comparison with the state-of-the-art methods using WT-SpCas9, eSpCas9(1.1), SpCas9-HF1 datasets, AttCRISPR can achieve an average Spearman value of 0.872, 0.867, 0.867, respectively on several public datasets, which is superior to these methods. Furthermore, benefits from two attention modules—one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, we can understand the decisions made by AttCRISPR at both global and local levels without other post hoc explanations techniques. CONCLUSION: With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at a global level. And at a local level, we prove that the interpretability of AttCRISPR can be used to guide the researchers to design sgRNA with higher activity. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04509-6. BioMed Central 2021-12-13 /pmc/articles/PMC8667445/ /pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title | AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title_full | AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title_fullStr | AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title_full_unstemmed | AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title_short | AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity |
title_sort | attcrispr: a spacetime interpretable model for prediction of sgrna on-target activity |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667445/ https://www.ncbi.nlm.nih.gov/pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6 |
work_keys_str_mv | AT xiaoliming attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity AT wanyunqi attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity AT jiangzhenran attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity |