Mathematical model for empirically optimizing large scale production of soluble protein domains

BACKGROUND: Efficient dissection of large proteins into their structural domains is critical for high throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production system. Here, we report a mathematical model for empirica...

Bibliographic Details
Main authors:	Chikayama, Eisuke; Kurotani, Atsushi; Tanaka, Takanori; Yabuki, Takashi; Miyazaki, Satoshi; Yokoyama, Shigeyuki; Kuroda, Yutaka
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2843616/ https://www.ncbi.nlm.nih.gov/pubmed/20193068 http://dx.doi.org/10.1186/1471-2105-11-113

collection	PubMed
description	BACKGROUND: Efficient dissection of large proteins into their structural domains is critical for high-throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol as a production system. Here, we report a mathematical model for empirically optimizing the cost of large-scale domain production in proteomics research. RESULTS: The model computes the expected number of successfully produced soluble domains, using a conditional probability that links domain identification and boundary identification. Typical values for the model's parameters were estimated from the experimental results of identifying soluble domains in the 2,032 Kazusa HUGE protein sequences. Among the 215 fragments, corresponding to 24 domains, that were expressed correctly, 111 fragments, corresponding to 18 domains, were soluble. Our model indicates that, under the conditions used in our pilot experiment, the probability of correctly predicting the existence of a domain was 81% (175/215) and that of correctly predicting its boundary was 63% (111/175). Under these conditions, the most cost/effort-effective way to produce soluble domains was to prepare one to seven fragments per predicted domain. CONCLUSIONS: Our mathematical modeling of protein dissection protocols indicates that the optimum number of fragments tested per domain is much smaller than expected a priori. The application range of our model is not limited to protein dissection; it can also be used for designing various large-scale mutational analyses or screening libraries.
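The arithmetic behind the two probabilities quoted in the abstract can be checked with a short sketch. The first part uses only the counts given above (215, 175, 111). The second part is a hypothetical illustration, not the authors' published model: it assumes that each fragment of a correctly predicted domain is soluble independently with the boundary probability, and the function names `p_soluble_domain` and `yield_per_fragment` are made up for this example.

```python
# Counts quoted in the abstract.
EXPRESSED_FRAGMENTS = 215   # fragments expressed correctly
DOMAIN_PREDICTED_OK = 175   # numerator of the 81% figure
SOLUBLE_FRAGMENTS = 111     # soluble fragments

# Probability of correctly predicting that a domain exists (about 81%).
p_domain = DOMAIN_PREDICTED_OK / EXPRESSED_FRAGMENTS
# Probability of correctly predicting the boundary, given the domain (about 63%).
p_boundary = SOLUBLE_FRAGMENTS / DOMAIN_PREDICTED_OK


def p_soluble_domain(n_fragments: int) -> float:
    """Chance that at least one of n fragments is soluble.

    ASSUMPTION (not from the source): fragments succeed independently
    with probability p_boundary once the domain prediction is correct.
    """
    return p_domain * (1.0 - (1.0 - p_boundary) ** n_fragments)


def yield_per_fragment(n_fragments: int) -> float:
    """Expected soluble domains obtained per fragment prepared (hypothetical cost measure)."""
    return p_soluble_domain(n_fragments) / n_fragments


if __name__ == "__main__":
    print(f"p_domain   = {p_domain:.3f}")
    print(f"p_boundary = {p_boundary:.3f}")
    for n in range(1, 8):
        print(n, f"{p_soluble_domain(n):.3f}", f"{yield_per_fragment(n):.3f}")
```

Under this simplified assumption the success probability rises with the number of fragments while the yield per fragment falls, which is the kind of trade-off a cost model has to balance. The actual cost function and parameters are defined in the article itself.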
id	pubmed-2843616
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-2843616 2010-03-23 BMC Bioinformatics. BioMed Central 2010-03-01 /pmc/articles/PMC2843616/ /pubmed/20193068 http://dx.doi.org/10.1186/1471-2105-11-113 Text en
Copyright ©2010 Chikayama et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.