MSCA: a spectral comparison algorithm between time series to identify protein-protein interactions

BACKGROUND: The interactions between pathogen proteins and their hosts allow pathogens to manipulate host cellular mechanisms to their advantage. The identification of host proteins that are targeted by virulent pathogen proteins is crucial to increase our understanding of infection mechanisms and t...

Bibliographic Details
Main authors: Arenas, Ailan F; Salcedo, Gladys E; Montoya, Andrey M; Gomez-Marin, Jorge E
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448560/
https://www.ncbi.nlm.nih.gov/pubmed/25963052
http://dx.doi.org/10.1186/s12859-015-0599-8
author Arenas, Ailan F
Salcedo, Gladys E
Montoya, Andrey M
Gomez-Marin, Jorge E
author_sort Arenas, Ailan F
collection PubMed
description BACKGROUND: The interactions between pathogen proteins and their hosts allow pathogens to manipulate host cellular mechanisms to their advantage. The identification of host proteins that are targeted by virulent pathogen proteins is crucial to increase our understanding of infection mechanisms and to propose new therapeutics that target pathogens. Understanding the virulence mechanisms of pathogens requires a detailed molecular description of the proteins involved, but acquiring this knowledge is time-consuming and prohibitively expensive. Therefore, we developed a statistical method based on hypothesis testing to compare the time series obtained by converting the physicochemical characteristics of the amino acids that form the primary structure of proteins, and thus to propose potential functional relations between proteins. We called this algorithm the multiple spectral comparison algorithm (MSCA); the MSCA was inspired by the BLASTP tool and was implemented in R code. The algorithm compares and relates multiple time series according to their spectral similarities, and the biological relation between them could be interpreted as either a similar function or a protein-protein interaction (PPI). RESULTS: A simulation study showed that the MSCA performed satisfactorily when comparing time series of unequal length generated from ARMA processes, with power close to 1. The MSCA achieved an average accuracy of 70% in detecting protein interactions using a threshold of 0.7 for our spectral measure, indicating that this algorithm could predict novel PPIs and pathogen-host interactions (PHIs) with acceptable confidence. The MSCA was also validated by its identification of well-known interactions of the human proteins MAGI1, SCRIB and JAK1, as well as interactions of the virulence proteins ROP16, ROP18, ROP17 and ROP5. We verified the spectral similarities for human intraspecific PPIs and PHIs that were previously demonstrated experimentally by other authors. We suggest that human GBP (a group of GTPases induced by interferon) and the CREB transcription factor family could be human substrates for the complex of ROP18, ROP17 and ROP5. CONCLUSIONS: Using multiple-hypothesis testing between the spectral densities of a set of time series of unequal length, we developed an algorithm that is able to identify the similarities or interactions between a set of proteins. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0599-8) contains supplementary material, which is available to authorized users.
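The general idea in the description (turn each protein into a numeric series, then compare spectra) can be illustrated with a short sketch. This is not the MSCA and not the authors' R code: the Kyte-Doolittle hydropathy scale, the fixed frequency grid, the overlap score and every function name below are assumptions chosen for illustration, and the score is not the paper's spectral measure, so the 0.7 threshold reported above should not be expected to carry over.

```python
import cmath
import math

# Kyte-Doolittle hydropathy index. Using this particular physicochemical
# scale is an assumption made for illustration; the record does not say
# which amino acid properties the MSCA uses.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_series(seq):
    """Map a protein sequence to a mean-centred numeric series."""
    values = [KD[aa] for aa in seq.upper()]
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def spectrum_on_grid(series, n_freqs=32):
    """Periodogram evaluated on a fixed grid of frequencies in (0, 0.5].

    Evaluating every series on the same grid lets series of different
    lengths be compared point by point. The result is scaled to sum to 1.
    """
    n = len(series)
    powers = []
    for k in range(1, n_freqs + 1):
        f = 0.5 * k / n_freqs
        z = sum(x * cmath.exp(-2j * math.pi * f * t)
                for t, x in enumerate(series))
        powers.append(abs(z) ** 2 / n)
    total = sum(powers)
    return [p / total for p in powers] if total > 0 else powers


def spectral_similarity(seq_a, seq_b, n_freqs=32):
    """Overlap of two normalised spectra (hypothetical score in [0, 1])."""
    pa = spectrum_on_grid(to_series(seq_a), n_freqs)
    pb = spectrum_on_grid(to_series(seq_b), n_freqs)
    return sum(min(a, b) for a, b in zip(pa, pb))


if __name__ == "__main__":
    # Made-up peptide fragments of different lengths, for demonstration only.
    a = "MKTAYIAKQRQISFVKSHFSRQ"
    b = "MKLVINGKTLKGEITVEV"
    print(spectral_similarity(a, b))
```

A sequence compared with itself scores 1 up to rounding, and the score is symmetric in its two arguments. A hypothesis test on the spectral densities, as the description mentions, would replace this ad hoc overlap score in a faithful implementation.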
format Online
Article
Text
id pubmed-4448560
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4448560 2015-05-30 MSCA: a spectral comparison algorithm between time series to identify protein-protein interactions Arenas, Ailan F Salcedo, Gladys E Montoya, Andrey M Gomez-Marin, Jorge E BMC Bioinformatics Research Article BioMed Central 2015-05-13 /pmc/articles/PMC4448560/ /pubmed/25963052 http://dx.doi.org/10.1186/s12859-015-0599-8 Text en © Arenas et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MSCA: a spectral comparison algorithm between time series to identify protein-protein interactions
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448560/
https://www.ncbi.nlm.nih.gov/pubmed/25963052
http://dx.doi.org/10.1186/s12859-015-0599-8