Predicting success of oligomerized pool engineering (OPEN) for zinc finger target site sequences

Bibliographic Details
Main authors: Sander, Jeffry D, Reyon, Deepak, Maeder, Morgan L, Foley, Jonathan E, Thibodeau-Beganny, Stacey, Li, Xiaohong, Regan, Maureen R, Dahlborg, Elizabeth J, Goodwin, Mathew J, Fu, Fengli, Voytas, Daniel F, Joung, J Keith, Dobbs, Drena
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098093/
https://www.ncbi.nlm.nih.gov/pubmed/21044337
http://dx.doi.org/10.1186/1471-2105-11-543
_version_ 1782203915882725376
author Sander, Jeffry D
Reyon, Deepak
Maeder, Morgan L
Foley, Jonathan E
Thibodeau-Beganny, Stacey
Li, Xiaohong
Regan, Maureen R
Dahlborg, Elizabeth J
Goodwin, Mathew J
Fu, Fengli
Voytas, Daniel F
Joung, J Keith
Dobbs, Drena
author_sort Sander, Jeffry D
collection PubMed
description BACKGROUND: Precise and efficient methods for gene targeting are critical for detailed functional analysis of genomes and regulatory networks and for potentially improving the efficacy and safety of gene therapies. Oligomerized Pool ENgineering (OPEN) is a recently developed method for engineering C2H2 zinc finger proteins (ZFPs) designed to bind specific DNA sequences with high affinity and specificity in vivo. Because generation of ZFPs using OPEN requires considerable effort, a computational method for identifying the sites in any given gene that are most likely to be successfully targeted by this method is desirable. RESULTS: Analysis of the base composition of experimentally validated ZFP target sites identified important constraints on the DNA sequence space that can be effectively targeted using OPEN. Using alternate encodings to represent ZFP target sites, we implemented Naïve Bayes and Support Vector Machine classifiers capable of distinguishing "active" targets, i.e., ZFP binding sites that can be targeted with a high rate of success, from those that are "inactive" or poor targets for ZFPs generated using current OPEN technologies. When evaluated using leave-one-out cross-validation on a dataset of 135 experimentally validated ZFP target sites, the best Naïve Bayes classifier, designated ZiFOpT, achieved overall accuracy of 87% and specificity(+) of 90%, with an ROC AUC of 0.89. When challenged with a completely independent test set of 140 newly validated ZFP target sites, ZiFOpT performance was comparable in terms of overall accuracy (88%) and specificity(+) (92%), but with reduced ROC AUC (0.77). Users can rank potentially active ZFP target sites using a confidence score derived from the posterior probability returned by ZiFOpT.
CONCLUSION: ZiFOpT, a machine learning classifier trained to identify DNA sequences amenable for targeting by OPEN-generated zinc finger arrays, can guide users to target sites that are most likely to function successfully in vivo, substantially reducing the experimental effort required. ZiFOpT is freely available and incorporated in the Zinc Finger Targeter web server (http://bindr.gdcb.iastate.edu/ZiFiT).
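The workflow summarized in the description (a Naïve Bayes classifier over target-site sequences, leave-one-out cross-validation, and ranking by posterior probability) can be illustrated with a minimal sketch. This is not the ZiFOpT implementation: the per-position nucleotide encoding, the pseudocount, the 0.5 threshold, the function names, and the eight toy sequences are all assumptions made for illustration, and the toy data carry no biological meaning. Specificity(+) is taken here to mean TP / (TP + FP), which is also an assumption, since the record does not define it.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

# Invented toy sites (1 = "active", 0 = "inactive"); NOT data from the study.
SITES = ["GAAGGTGCA", "GTGGACGGA", "GCTGAAGTG", "GGAGCGGAT",
         "TCATTCAAC", "CATACTTCA", "ATCTACCTT", "TTACATCAA"]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def train(sites, labels, pseudocount=1.0):
    """Fit a per-position Naive Bayes model on equal-length DNA sites."""
    length = len(sites[0])
    model = {"prior": {}, "cond": {}}
    for cls in set(labels):
        members = [s for s, y in zip(sites, labels) if y == cls]
        model["prior"][cls] = math.log(len(members) / len(sites))
        total = len(members) + pseudocount * len(ALPHABET)
        # One smoothed log-probability table per sequence position.
        table = []
        for pos in range(length):
            counts = Counter(s[pos] for s in members)
            table.append({b: math.log((counts[b] + pseudocount) / total)
                          for b in ALPHABET})
        model["cond"][cls] = table
    return model


def posterior_active(model, site):
    """Posterior probability that a site belongs to class 1 ("active")."""
    log_joint = {
        cls: model["prior"][cls]
        + sum(model["cond"][cls][i][base] for i, base in enumerate(site))
        for cls in model["prior"]
    }
    peak = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {cls: math.exp(v - peak) for cls, v in log_joint.items()}
    return weights.get(1, 0.0) / sum(weights.values())


def leave_one_out(sites, labels, threshold=0.5):
    """Hold out each site once; return (accuracy, TP / (TP + FP))."""
    tp = fp = tn = fn = 0
    for i in range(len(sites)):
        model = train(sites[:i] + sites[i + 1:], labels[:i] + labels[i + 1:])
        predicted = 1 if posterior_active(model, sites[i]) >= threshold else 0
        if predicted == 1 and labels[i] == 1:
            tp += 1
        elif predicted == 1:
            fp += 1
        elif labels[i] == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(sites)
    specificity_plus = tp / (tp + fp) if tp + fp else float("nan")
    return accuracy, specificity_plus


if __name__ == "__main__":
    print(leave_one_out(SITES, LABELS))
    model = train(SITES, LABELS)
    candidates = ["GCCGCCGCC", "TTTTTTTTT", "GATGTCACA"]
    # Rank candidate sites by the posterior, highest confidence first.
    for site in sorted(candidates, key=lambda s: posterior_active(model, s), reverse=True):
        print(site, round(posterior_active(model, site), 3))
```

Sorting candidates by the returned posterior mirrors the confidence-score ranking mentioned in the description; a real application would train on experimentally validated sites rather than invented ones.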
format Text
id pubmed-3098093
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3098093 2011-07-08 Predicting success of oligomerized pool engineering (OPEN) for zinc finger target site sequences Sander, Jeffry D Reyon, Deepak Maeder, Morgan L Foley, Jonathan E Thibodeau-Beganny, Stacey Li, Xiaohong Regan, Maureen R Dahlborg, Elizabeth J Goodwin, Mathew J Fu, Fengli Voytas, Daniel F Joung, J Keith Dobbs, Drena BMC Bioinformatics Methodology Article BioMed Central 2010-11-02 /pmc/articles/PMC3098093/ /pubmed/21044337 http://dx.doi.org/10.1186/1471-2105-11-543 Text en Copyright ©2010 Sander et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting success of oligomerized pool engineering (OPEN) for zinc finger target site sequences
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098093/
https://www.ncbi.nlm.nih.gov/pubmed/21044337
http://dx.doi.org/10.1186/1471-2105-11-543