
High accuracy operon prediction method based on STRING database scores

We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T...
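
The two inputs named in the abstract, the intergenic distance and a STRING functional-association score for each pair of contiguous genes, can be illustrated with a small sketch. This is not the authors' trained neural network: the `Gene` record, the function names, the weights, the 0.5 threshold, and the assumption that STRING scores are scaled from 0 to 1 are all hypothetical choices made for illustration, and a single logistic unit stands in for the network.

```python
import math
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def intergenic_distance(upstream: Gene, downstream: Gene) -> int:
    """Bases between two adjacent genes; negative when they overlap."""
    return downstream.start - upstream.end - 1


def same_operon_score(distance: int, string_score: float,
                      w_dist: float = -0.02, w_string: float = 4.0,
                      bias: float = 0.0) -> float:
    """Toy logistic unit. The weights are made up, not trained values."""
    z = w_dist * distance + w_string * string_score + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_pairs(genes, string_scores, threshold=0.5):
    """Label each pair of contiguous genes as same-operon (True) or not."""
    ordered = sorted(genes, key=lambda g: g.start)
    calls = []
    for a, b in zip(ordered, ordered[1:]):
        # Genes on opposite strands cannot share a transcript.
        if a.strand != b.strand:
            calls.append((a.name, b.name, False))
            continue
        # Missing STRING evidence is treated as a score of 0 (assumption).
        s = string_scores.get(frozenset((a.name, b.name)), 0.0)
        p = same_operon_score(intergenic_distance(a, b), s)
        calls.append((a.name, b.name, p >= threshold))
    return calls


# Hypothetical coordinates and scores, for illustration only.
genes = [
    Gene("g1", 1, 900, "+"),
    Gene("g2", 912, 2000, "+"),
    Gene("g3", 2400, 3000, "+"),
    Gene("g4", 3100, 3500, "-"),
]
string_scores = {
    frozenset(("g1", "g2")): 0.9,
    frozenset(("g2", "g3")): 0.1,
}
calls = predict_pairs(genes, string_scores)
```

In this toy setting a short gap combined with a high association score favors a same-operon call, while a long gap with weak association does not. A real model would learn its parameters from experimentally characterized operons, as the abstract describes.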


Bibliographic Details
Main authors: Taboada, Blanca; Verde, Cristina; Merino, Enrique
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896540/
https://www.ncbi.nlm.nih.gov/pubmed/20385580
http://dx.doi.org/10.1093/nar/gkq254
author Taboada, Blanca
Verde, Cristina
Merino, Enrique
collection PubMed
description We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8–a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412–D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6 and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, to evaluate the predictive accuracy of our model when one organism's data set is used for training and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and the B. subtilis analysis using a neural network trained with E. coli data. Even in these cases, the accuracies reached with our method were outstandingly high, 91.5 and 93%, respectively. These results show the potential of our method for accurately predicting the operons of any other organism. Our operon predictions for fully sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
format Text
id pubmed-2896540
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2896540 2010-07-06 High accuracy operon prediction method based on STRING database scores Taboada, Blanca Verde, Cristina Merino, Enrique Nucleic Acids Res Methods Online Oxford University Press 2010-07 2010-04-12 /pmc/articles/PMC2896540/ /pubmed/20385580 http://dx.doi.org/10.1093/nar/gkq254 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title High accuracy operon prediction method based on STRING database scores
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896540/
https://www.ncbi.nlm.nih.gov/pubmed/20385580
http://dx.doi.org/10.1093/nar/gkq254