Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts
Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical meth...
Main authors: | Portela, Rui M C; von Stosch, Moritz; Oliveira, Rui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513808/ https://www.ncbi.nlm.nih.gov/pubmed/32995518 http://dx.doi.org/10.1093/synbio/ysy010 |
_version_ | 1783586456296488960 |
---|---|
author | Portela, Rui M C; von Stosch, Moritz; Oliveira, Rui |
author_sort | Portela, Rui M C |
collection | PubMed |
description | Predicting the activity of modified biological parts is difficult due to the typically large size of nucleotide sequences, resulting in combinatorial designs that suffer from the “curse of dimensionality” problem. Mechanistic design methods are often limited by knowledge availability. Empirical methods typically require large data sets, which are difficult and/or costly to obtain. In this study, we explore for the first time the combination of both approaches within a formal hybrid semiparametric framework in an attempt to overcome the limitations of the current approaches. Protein translation as a function of the 5′ untranslated region sequence in Escherichia coli is taken as a case study. Thermodynamic modeling, partial least squares (PLS) and hybrid parallel combinations thereof are compared for different data sets and data partitioning scenarios. The results suggest a significant and systematic reduction of both calibration and prediction errors by the hybrid approach in comparison to standalone thermodynamic or PLS modeling. Improvements, although of different magnitudes, are observed irrespective of sample size and partitioning method. Overall, the results suggest that the hybrid method increases predictive power, potentially leading to a more efficient design of biological parts. |
format | Online Article Text |
id | pubmed-7513808 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7513808 2020-09-28 Portela, Rui M C; von Stosch, Moritz; Oliveira, Rui. Synth Biol (Oxf). Research Article. Oxford University Press 2018-06-26 /pmc/articles/PMC7513808/ /pubmed/32995518 http://dx.doi.org/10.1093/synbio/ysy010 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Hybrid semiparametric systems for quantitative sequence-activity modeling of synthetic biological parts |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7513808/ https://www.ncbi.nlm.nih.gov/pubmed/32995518 http://dx.doi.org/10.1093/synbio/ysy010 |