Cargando…

Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts

Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical meth...

Descripción completa

Detalles Bibliográficos
Autores principales: Portela, Rui M C, von Stosch, Moritz, Oliveira, Rui
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2018
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513808/
https://www.ncbi.nlm.nih.gov/pubmed/32995518
http://dx.doi.org/10.1093/synbio/ysy010
Descripción
Sumario:Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical methods typically require large data sets, which are difficult and/or costly to obtain. In this study, we explore for the first time the combination of both approaches within a formal hybrid semiparametric framework in an attempt to overcome the limitations of the current approaches. Protein translation as a function of the 5’ untranslated region sequence in Escherichia coli is taken as case study. Thermodynamic modeling, partial least squares (PLS) and hybrid parallel combinations thereof are compared for different data sets and data partitioning scenarios. The results suggest a significant and systematic reduction of both calibration and prediction errors by the hybrid approach in comparison to standalone thermodynamic or PLS modeling. Although with different magnitudes, improvements are observed irrespective of sample size and partitioning method. All in all the results suggest an increase of predictive power by the hybrid method potentially leading to a more efficient design of biological parts.