
Artificial Neural Networks Trained to Detect Viral and Phage Structural Proteins

Phages play critical roles in the survival and pathogenicity of their hosts, via lysogenic conversion factors, and in nutrient redistribution, via cell lysis. Analyses of phage- and viral-encoded genes in environmental samples provide insights into the physiological impact of viruses on microbial co...


Bibliographic Details
Main authors: Seguritan, Victor, Alves, Nelson, Arnoult, Michael, Raymond, Amy, Lorimer, Don, Burgin, Alex B., Salamon, Peter, Segall, Anca M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3426561/
https://www.ncbi.nlm.nih.gov/pubmed/22927809
http://dx.doi.org/10.1371/journal.pcbi.1002657
author Seguritan, Victor
Alves, Nelson
Arnoult, Michael
Raymond, Amy
Lorimer, Don
Burgin, Alex B.
Salamon, Peter
Segall, Anca M.
collection PubMed
description Phages play critical roles in the survival and pathogenicity of their hosts, via lysogenic conversion factors, and in nutrient redistribution, via cell lysis. Analyses of phage- and viral-encoded genes in environmental samples provide insights into the physiological impact of viruses on microbial communities and human health. However, phage ORFs are extremely diverse, and over 70% of them are dissimilar to any genes with annotated functions in GenBank. Better identification of viruses would also aid in better detection and diagnosis of disease, in vaccine development, and generally in better understanding the physiological potential of any environment. In contrast to enzymes, viral structural protein function can be much more challenging to detect from sequence data because of low sequence conservation, few known conserved catalytic sites or sequence domains, and relatively limited experimental data. We have designed a method of predicting phage structural protein sequences that uses Artificial Neural Networks (ANNs). First, we trained ANNs to classify viral structural proteins using amino acid frequency; these correctly classify a large fraction of test cases with a high degree of specificity and sensitivity. Subsequently, we added estimates of protein isoelectric points as a feature to ANNs that classify specialized families of proteins, namely major capsid and tail proteins. As expected, these more specialized ANNs are more accurate than the structural ANNs. To experimentally validate the ANN predictions, several ORFs with no significant similarities to known sequences that are ANN-predicted structural proteins were examined by transmission electron microscopy. Some of these self-assembled into structures strongly resembling virion structures. Thus, our ANNs are new tools for identifying phage and potential prophage structural proteins that are difficult or impossible to detect by other bioinformatic analyses. The networks will be valuable when sequence is available but in vitro propagation of the phage may not be practical or possible.
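The abstract names amino acid frequency as the input feature for the structural-protein ANNs. As a minimal sketch of what such a feature vector could look like, the snippet below computes the fraction of each of the 20 standard residues in a protein sequence. The function name, the residue ordering, and the choice to ignore non-standard characters are assumptions made for illustration; the record does not describe the authors' actual preprocessing or network architecture.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an arbitrary assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence):
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Hypothetical helper: characters outside the 20 standard residues
    (for example 'X' or '*') are ignored rather than counted.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence, not taken from the paper.
features = amino_acid_frequencies("MKKLAA")
```

A vector like `features` (20 values summing to 1) is the kind of fixed-length input a classifier could consume regardless of protein length; an isoelectric point estimate, as mentioned for the capsid and tail ANNs, would be one additional value appended to it.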
format Online
Article
Text
id pubmed-3426561
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3426561 2012-08-27 Artificial Neural Networks Trained to Detect Viral and Phage Structural Proteins Seguritan, Victor Alves, Nelson Arnoult, Michael Raymond, Amy Lorimer, Don Burgin, Alex B. Salamon, Peter Segall, Anca M. PLoS Comput Biol Research Article Public Library of Science 2012-08-23 /pmc/articles/PMC3426561/ /pubmed/22927809 http://dx.doi.org/10.1371/journal.pcbi.1002657 Text en © 2012 Seguritan et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Artificial Neural Networks Trained to Detect Viral and Phage Structural Proteins
topic Research Article