
Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships

BACKGROUND: Characterizing phage–host interactions is critical to understanding the ecological role of both partners and effective isolation of phage therapeuticals. Unfortunately, experimental methods for studying these interactions are markedly slow, low-throughput, and unsuitable for phages or ho...


Bibliographic Details
Main Authors: Zielezinski, Andrzej, Barylski, Jakub, Karlowski, Wojciech M.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8501573/
https://www.ncbi.nlm.nih.gov/pubmed/34625070
http://dx.doi.org/10.1186/s12915-021-01146-6
_version_ 1784580711444905984
author Zielezinski, Andrzej
Barylski, Jakub
Karlowski, Wojciech M.
author_facet Zielezinski, Andrzej
Barylski, Jakub
Karlowski, Wojciech M.
author_sort Zielezinski, Andrzej
collection PubMed
description BACKGROUND: Characterizing phage–host interactions is critical to understanding the ecological role of both partners and effective isolation of phage therapeuticals. Unfortunately, experimental methods for studying these interactions are markedly slow, low-throughput, and unsuitable for phages or hosts difficult to maintain in laboratory conditions. Therefore, a number of in silico methods emerged to predict prokaryotic hosts based on viral sequences. One of the leading approaches is the application of the BLAST tool that searches for local similarities between viral and microbial genomes. However, this prediction method has three major limitations: (i) top-scoring sequences do not always point to the actual host; (ii) mosaic virus genomes may match to many, typically related, bacteria; and (iii) viral and host sequences may diverge beyond the point where their relationship can be detected by a BLAST alignment. RESULTS: We created an extension to BLAST, named Phirbo, that improves host prediction quality beyond what is obtainable from standard BLAST searches. The tool harnesses information concerning sequence similarity and bacteria relatedness to predict phage–host interactions. Phirbo was evaluated on three benchmark sets of known virus–host pairs, and it improved precision and recall by 11–40 percentage points over currently available, state-of-the-art, alignment-based, alignment-free, and machine-learning host prediction tools. Moreover, the discriminatory power of Phirbo for the recognition of virus–host relationships surpassed the results of other tools by at least 10 percentage points (area under the curve = 0.95), yielding a mean host prediction accuracy of 57% and 68% at the genus and family levels, respectively; accuracy drops by 12 percentage points when using only a fraction of viral genome sequences (3 kb). 
Finally, we provide insights into a repertoire of protein and ncRNA genes that are shared between phages and hosts and may be prone to horizontal transfer during infection. CONCLUSIONS: Our results suggest that Phirbo is a simple and effective tool for predicting phage–host relationships. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-021-01146-6.
format Online
Article
Text
id pubmed-8501573
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8501573 2021-10-20 Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships Zielezinski, Andrzej Barylski, Jakub Karlowski, Wojciech M. BMC Biol Methodology Article BACKGROUND: Characterizing phage–host interactions is critical to understanding the ecological role of both partners and effective isolation of phage therapeuticals. Unfortunately, experimental methods for studying these interactions are markedly slow, low-throughput, and unsuitable for phages or hosts difficult to maintain in laboratory conditions. Therefore, a number of in silico methods emerged to predict prokaryotic hosts based on viral sequences. One of the leading approaches is the application of the BLAST tool that searches for local similarities between viral and microbial genomes. However, this prediction method has three major limitations: (i) top-scoring sequences do not always point to the actual host; (ii) mosaic virus genomes may match to many, typically related, bacteria; and (iii) viral and host sequences may diverge beyond the point where their relationship can be detected by a BLAST alignment. RESULTS: We created an extension to BLAST, named Phirbo, that improves host prediction quality beyond what is obtainable from standard BLAST searches. The tool harnesses information concerning sequence similarity and bacteria relatedness to predict phage–host interactions. Phirbo was evaluated on three benchmark sets of known virus–host pairs, and it improved precision and recall by 11–40 percentage points over currently available, state-of-the-art, alignment-based, alignment-free, and machine-learning host prediction tools. 
Moreover, the discriminatory power of Phirbo for the recognition of virus–host relationships surpassed the results of other tools by at least 10 percentage points (area under the curve = 0.95), yielding a mean host prediction accuracy of 57% and 68% at the genus and family levels, respectively; accuracy drops by 12 percentage points when using only a fraction of viral genome sequences (3 kb). Finally, we provide insights into a repertoire of protein and ncRNA genes that are shared between phages and hosts and may be prone to horizontal transfer during infection. CONCLUSIONS: Our results suggest that Phirbo is a simple and effective tool for predicting phage–host relationships. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12915-021-01146-6. BioMed Central 2021-10-08 /pmc/articles/PMC8501573/ /pubmed/34625070 http://dx.doi.org/10.1186/s12915-021-01146-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Zielezinski, Andrzej
Barylski, Jakub
Karlowski, Wojciech M.
Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title_full Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title_fullStr Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title_full_unstemmed Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title_short Taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
title_sort taxonomy-aware, sequence similarity ranking reliably predicts phage–host relationships
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8501573/
https://www.ncbi.nlm.nih.gov/pubmed/34625070
http://dx.doi.org/10.1186/s12915-021-01146-6
work_keys_str_mv AT zielezinskiandrzej taxonomyawaresequencesimilarityrankingreliablypredictsphagehostrelationships
AT barylskijakub taxonomyawaresequencesimilarityrankingreliablypredictsphagehostrelationships
AT karlowskiwojciechm taxonomyawaresequencesimilarityrankingreliablypredictsphagehostrelationships