
Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts

Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical meth...


Bibliographic Details
Main Authors: Portela, Rui M C, von Stosch, Moritz, Oliveira, Rui
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513808/
https://www.ncbi.nlm.nih.gov/pubmed/32995518
http://dx.doi.org/10.1093/synbio/ysy010
author Portela, Rui M C
von Stosch, Moritz
Oliveira, Rui
collection PubMed
description Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical methods typically require large data sets, which are difficult and/or costly to obtain. In this study, we explore for the first time the combination of both approaches within a formal hybrid semiparametric framework in an attempt to overcome the limitations of the current approaches. Protein translation as a function of the 5′ untranslated region sequence in Escherichia coli is taken as a case study. Thermodynamic modeling, partial least squares (PLS) and hybrid parallel combinations thereof are compared for different data sets and data partitioning scenarios. The results suggest a significant and systematic reduction of both calibration and prediction errors by the hybrid approach in comparison to standalone thermodynamic or PLS modeling. Improvements, although of differing magnitude, are observed irrespective of sample size and partitioning method. Overall, the results suggest that the hybrid method increases predictive power, potentially leading to a more efficient design of biological parts.
format Online
Article
Text
id pubmed-7513808
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7513808 2020-09-28; Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts; Portela, Rui M C; von Stosch, Moritz; Oliveira, Rui; Synth Biol (Oxf); Research Article; Oxford University Press; 2018-06-26; /pmc/articles/PMC7513808/; /pubmed/32995518; http://dx.doi.org/10.1093/synbio/ysy010; Text; en; © The Author(s) 2018. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts
topic Research Article