
AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity

BACKGROUND: A growing number of Cas9 variants with higher specificity are being developed to avoid off-target effects, which has produced a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while deep learning-based methods often lack interpretability,...


Bibliographic Details
Main authors:	Xiao, Li-Ming, Wan, Yun-Qi, Jiang, Zhen-Ran
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667445/ https://www.ncbi.nlm.nih.gov/pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6

author	Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran
collection	PubMed
description	BACKGROUND: A growing number of Cas9 variants with higher specificity are being developed to avoid off-target effects, which has produced a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while deep learning-based methods often lack interpretability, forcing researchers to trade off accuracy against interpretability. A method is needed that matches deep learning-based methods in performance while offering interpretability comparable to that of conventional machine learning methods. RESULTS: To overcome these problems, we propose AttCRISPR, an intrinsically interpretable deep learning-based method for predicting on-target activity. The advantage of AttCRISPR lies in using an ensemble learning strategy to stack available encoding-based methods and embedding-based methods with strong interpretability. Compared with state-of-the-art methods on several public datasets for WT-SpCas9, eSpCas9(1.1), and SpCas9-HF1, AttCRISPR achieves average Spearman values of 0.872, 0.867, and 0.867, respectively, outperforming these methods. Furthermore, thanks to two attention modules, one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, the decisions made by AttCRISPR can be understood at both global and local levels without additional post hoc explanation techniques. CONCLUSION: At a global level, the trained models reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset. At a local level, we show that the interpretability of AttCRISPR can guide researchers in designing sgRNAs with higher activity. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04509-6.
format	Online Article Text
id	pubmed-8667445
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8667445 2021-12-13 AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity Xiao, Li-Ming Wan, Yun-Qi Jiang, Zhen-Ran BMC Bioinformatics Research BioMed Central 2021-12-13 /pmc/articles/PMC8667445/ /pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667445/ https://www.ncbi.nlm.nih.gov/pubmed/34903170 http://dx.doi.org/10.1186/s12859-021-04509-6
work_keys_str_mv	AT xiaoliming attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity AT wanyunqi attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity AT jiangzhenran attcrispraspacetimeinterpretablemodelforpredictionofsgrnaontargetactivity
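
The description above reports model quality as a Spearman value between predicted and measured on-target activity, and refers to encoding-based inputs. As a rough illustration only, the following Python sketch shows one common way to one-hot encode a guide sequence and to compute a Spearman rank correlation. It is not the authors' implementation: the function names (`one_hot`, `average_ranks`, `spearman`) and the toy activity values are hypothetical and chosen purely for the example.

```python
from math import sqrt


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T).

    Bases outside ACGT become an all-zero row (an assumption of this sketch).
    """
    alphabet = "ACGT"
    return [[1 if base == letter else 0 for letter in alphabet]
            for base in seq.upper()]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values, invented for illustration (not data from the article).
measured = [0.10, 0.35, 0.50, 0.80, 0.95]
predicted = [0.20, 0.30, 0.70, 0.60, 0.90]

print(one_hot("ACGT")[0])                       # row for the first base, A
print(round(spearman(measured, predicted), 3))  # 0.9
```

Because Spearman correlation depends only on rank order, it rewards a model for ordering guides correctly by activity even when the predicted magnitudes are off, which is usually what matters when choosing among candidate sgRNAs.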
