Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm

Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions o...

Bibliographic Details
Main authors: Craig, Roger A., Lu, Jin, Luo, Jinquan, Shi, Lei, Liao, Li
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2811015/
https://www.ncbi.nlm.nih.gov/pubmed/19889723
http://dx.doi.org/10.1093/nar/gkp906
_version_ 1782176719802728448
author Craig, Roger A.
Lu, Jin
Luo, Jinquan
Shi, Lei
Liao, Li
collection PubMed
description Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences while placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are given as the target, then codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, finding codon nucleotide distributions that exactly match a given target distribution of amino acids turns out to be a highly nontrivial computational task. We first showed that, for a given target distribution, an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. Compared with the previous gradient descent method on various objective functions, the new method consistently gave better-optimized distributions, as measured by the relative entropy between the calculated and the target distributions. To simulate actual lab solutions, new objective functions were designed that allow two separate sets of codons to be used in seeking a better match to the target amino acid distribution.
format Text
id pubmed-2811015
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2811015 2010-01-26 Nucleic Acids Res Methods Online Oxford University Press 2010-01 2009-11-04 /pmc/articles/PMC2811015/ /pubmed/19889723 http://dx.doi.org/10.1093/nar/gkp906 Text en © The Author(s) 2009. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2811015/
https://www.ncbi.nlm.nih.gov/pubmed/19889723
http://dx.doi.org/10.1093/nar/gkp906
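The description in this record states the optimization problem only in words: choose nucleotide probabilities at the three codon positions so that the induced amino acid distribution is as close as possible to a target, with closeness measured by relative entropy. The Python sketch below illustrates that setup under several assumptions that are not taken from the article: positions are treated as independent, the stop signal is counted as a 21st category ('*'), the relative entropy is taken as D(target || calculated), and the genetic algorithm uses generic operators (truncation selection, per-position crossover, Gaussian mutation). All function names are hypothetical, and this is not the authors' implementation.

```python
import itertools
import math
import random

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i]
               for i, c in enumerate(itertools.product(BASES, repeat=3))}

def amino_acid_distribution(design):
    """design: three dicts (one per codon position) mapping base -> probability."""
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= design[pos][base]  # assumption: positions are independent
        dist[aa] = dist.get(aa, 0.0) + p
    return dist

def relative_entropy(target, calculated):
    """D(target || calculated); the direction is an assumption of this sketch."""
    total = 0.0
    for aa, p in target.items():
        if p <= 0.0:
            continue
        q = calculated.get(aa, 0.0)
        if q <= 0.0:
            return math.inf
        total += p * math.log(p / q)
    return total

def normalize(weights):
    s = sum(weights.values())
    return {b: w / s for b, w in weights.items()}

def random_design(rng):
    return [normalize({b: rng.random() + 1e-9 for b in BASES}) for _ in range(3)]

def mutate(design, rng, sigma=0.1):
    # Perturb one codon position, keep weights positive, renormalize.
    child = [dict(pos) for pos in design]
    i = rng.randrange(3)
    child[i] = normalize({b: max(1e-9, w + rng.gauss(0.0, sigma))
                          for b, w in child[i].items()})
    return child

def crossover(a, b, rng):
    # Take each codon position from one parent or the other.
    return [dict(rng.choice((pa, pb))) for pa, pb in zip(a, b)]

def optimize(target, generations=200, pop_size=40, seed=0):
    rng = random.Random(seed)
    uniform = [{b: 0.25 for b in BASES} for _ in range(3)]
    # The uniform design is seeded into the population as a baseline.
    population = [uniform] + [random_design(rng) for _ in range(pop_size - 1)]

    def score(design):
        return relative_entropy(target, amino_acid_distribution(design))

    for _ in range(generations):
        population.sort(key=score)
        parents = population[: pop_size // 2]  # survivors are kept (elitist)
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = min(population, key=score)
    return best, score(best)

# Usage: a target that is reachable by construction (an NNK-style design).
nnk = [{b: 0.25 for b in BASES},
       {b: 0.25 for b in BASES},
       {"T": 0.5, "C": 0.0, "A": 0.0, "G": 0.5}]
target = amino_acid_distribution(nnk)
best_design, best_kl = optimize(target)
print(best_kl)
```

Because the survivors of each generation are carried over unchanged, the best score never gets worse from one generation to the next. A target taken from real design data need not be reachable at all, in which case the minimum relative entropy stays above zero.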