Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities

Life science studies involving clustered regularly interspaced short palindromic repeat (CRISPR) editing generally apply the best-performing guide RNA (gRNA) for a gene of interest. Computational models are combined with massive experimental quantification on synthetic gRNA-target libraries to accur...

Bibliographic Details
Main Authors:	Zhang, Heng; Yan, Jianfeng; Lu, Zhike; Zhou, Yangfan; Zhang, Qingfeng; Cui, Tingting; Li, Yini; Chen, Hui; Ma, Lijia
Format:	Online Article Text
Language:	English
Published:	Springer Nature Singapore 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10188485/ https://www.ncbi.nlm.nih.gov/pubmed/37193681 http://dx.doi.org/10.1038/s41421-023-00549-9

_version_	1785042923921866752
author	Zhang, Heng; Yan, Jianfeng; Lu, Zhike; Zhou, Yangfan; Zhang, Qingfeng; Cui, Tingting; Li, Yini; Chen, Hui; Ma, Lijia
author_sort	Zhang, Heng
collection	PubMed
description	Life science studies involving clustered regularly interspaced short palindromic repeat (CRISPR) editing generally apply the best-performing guide RNA (gRNA) for a gene of interest. Computational models are combined with massive experimental quantification on synthetic gRNA-target libraries to accurately predict gRNA activity and mutational patterns. However, the measurements are inconsistent between studies due to differences in the designs of the gRNA-target pair constructs, and there has not yet been an integrated investigation that concurrently focuses on multiple facets of gRNA capacity. In this study, we analyzed the DNA double-strand break (DSB)-induced repair outcomes and measured SpCas9/gRNA activities at both matched and mismatched locations using 926,476 gRNAs covering 19,111 protein-coding genes and 20,268 non-coding genes. We developed machine learning models to forecast the on-target cleavage efficiency (AIdit_ON), off-target cleavage specificity (AIdit_OFF), and mutational profiles (AIdit_DSB) of SpCas9/gRNA from a uniformly collected and processed dataset by deep sampling and massively quantifying gRNA capabilities in K562 cells. Each of these models exhibited superlative performance in predicting SpCas9/gRNA activities on independent datasets when benchmarked with previous models. A previously unknown parameter was also empirically determined regarding the “sweet spot” in the size of datasets used to establish an effective model to predict gRNA capabilities at a manageable experimental scale. In addition, we observed cell type-specific mutational profiles and were able to link nucleotidylexotransferase as the key factor driving these outcomes. These massive datasets and deep learning algorithms have been implemented into the user-friendly web service http://crispr-aidit.com to evaluate and rank gRNAs for life science studies.
format	Online Article Text
id	pubmed-10188485
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Springer Nature Singapore
record_format	MEDLINE/PubMed
spelling	pubmed-10188485 2023-05-18 Cell Discov Article Springer Nature Singapore 2023-05-16 /pmc/articles/PMC10188485/ /pubmed/37193681 http://dx.doi.org/10.1038/s41421-023-00549-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10188485/ https://www.ncbi.nlm.nih.gov/pubmed/37193681 http://dx.doi.org/10.1038/s41421-023-00549-9
