Genome-wide prediction of prokaryotic two-component system networks using a sequence-based meta-predictor
BACKGROUND: Two component systems (TCS) are signalling complexes manifested by a histidine kinase (receptor) and a response regulator (effector). They are the most abundant signalling pathways in prokaryotes and control a wide range of biological processes. The pairing of these two components is hig...
Main Authors: | Kara, Altan; Vickers, Martin; Swain, Martin; Whitworth, David E.; Fernandez-Fuentes, Narcis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4575426/ https://www.ncbi.nlm.nih.gov/pubmed/26384938 http://dx.doi.org/10.1186/s12859-015-0741-7 |
_version_ | 1782390773358002176 |
---|---|
author | Kara, Altan Vickers, Martin Swain, Martin Whitworth, David E. Fernandez-Fuentes, Narcis |
author_sort | Kara, Altan |
collection | PubMed |
description | BACKGROUND: Two component systems (TCS) are signalling complexes manifested by a histidine kinase (receptor) and a response regulator (effector). They are the most abundant signalling pathways in prokaryotes and control a wide range of biological processes. The pairing of these two components is highly specific, often requiring costly and time-consuming experimental characterisation. Therefore, there is considerable interest in developing accurate prediction tools to lessen the burden of experimental work and cope with the ever-increasing amount of genomic information. RESULTS: We present a novel meta-predictor, MetaPred2CS, which is based on a support vector machine. MetaPred2CS integrates six sequence-based prediction methods: in-silico two-hybrid, mirror-tree, gene fusion, phylogenetic profiling, gene neighbourhood, and gene operon. To benchmark MetaPred2CS, we also compiled a novel high-quality training dataset of experimentally deduced TCS protein pairs for k-fold cross validation, to act as a gold standard for TCS partnership predictions. Combining individual predictions using MetaPred2CS improved performance when compared to the individual methods and in comparison with a current state-of-the-art meta-predictor. CONCLUSION: We have developed MetaPred2CS, a support vector machine-based metapredictor for prokaryotic TCS protein pairings. Central to the success of MetaPred2CS is a strategy of integrating individual predictors that improves the overall prediction accuracy, with the in-silico two-hybrid method contributing most to performance. MetaPred2CS outperformed other available systems in our benchmark tests, and is available online at http://metapred2cs.ibers.aber.ac.uk, along with our gold standard dataset of TCS interaction pairs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0741-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4575426 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4575426 2015-09-20 BMC Bioinformatics Methodology Article BioMed Central 2015-09-18 /pmc/articles/PMC4575426/ /pubmed/26384938 http://dx.doi.org/10.1186/s12859-015-0741-7 Text en © Kara et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Genome-wide prediction of prokaryotic two-component system networks using a sequence-based meta-predictor |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4575426/ https://www.ncbi.nlm.nih.gov/pubmed/26384938 http://dx.doi.org/10.1186/s12859-015-0741-7 |
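
The abstract in this record names phylogenetic profiling as one of the six sequence-based methods that MetaPred2CS integrates. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below scores candidate response regulators against a histidine kinase by comparing presence/absence profiles across a set of genomes. The use of Jaccard similarity is an assumption made for this example, and the function names, profile vectors, and protein labels are hypothetical toy data.

```python
from typing import Dict, List, Sequence, Tuple


def jaccard_similarity(profile_a: Sequence[int], profile_b: Sequence[int]) -> float:
    """Jaccard similarity between two binary presence/absence profiles.

    Each position corresponds to one genome: 1 if a homologue is present,
    0 if it is absent. Returns 0.0 when neither protein occurs anywhere.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def rank_partners(
    hk_profile: Sequence[int],
    rr_profiles: Dict[str, Sequence[int]],
) -> List[Tuple[str, float]]:
    """Rank candidate response regulators by profile similarity to one kinase."""
    scores = {
        name: jaccard_similarity(hk_profile, profile)
        for name, profile in rr_profiles.items()
    }
    # Highest similarity first: co-occurring proteins are the likelier partners.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy profiles over six genomes (assumed data, not from the paper).
hk = [1, 1, 0, 1, 0, 1]
candidates = {
    "RR_a": [1, 1, 0, 1, 0, 1],
    "RR_b": [1, 0, 1, 0, 1, 0],
    "RR_c": [1, 1, 0, 0, 0, 1],
}

ranking = rank_partners(hk, candidates)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

In a meta-predictor of the kind the abstract describes, a score like this would be only one of several inputs to the classifier; how MetaPred2CS computes and combines its scores is not stated in this record.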