Rapid discovery of novel prophages using biological feature engineering and machine learning
Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We pre...
Main authors: | Sirén, Kimmo; Millard, Andrew; Petersen, Bent; Gilbert, M Thomas P; Clokie, Martha R J; Sicheritz-Pontén, Thomas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Methods Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787355/ https://www.ncbi.nlm.nih.gov/pubmed/33575651 http://dx.doi.org/10.1093/nargab/lqaa109 |
_version_ | 1783632807536361472 |
---|---|
author | Sirén, Kimmo Millard, Andrew Petersen, Bent Gilbert, M Thomas P Clokie, Martha R J Sicheritz-Pontén, Thomas |
author_facet | Sirén, Kimmo Millard, Andrew Petersen, Bent Gilbert, M Thomas P Clokie, Martha R J Sicheritz-Pontén, Thomas |
author_sort | Sirén, Kimmo |
collection | PubMed |
description | Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We present a novel, fast and generalizing machine learning method based on feature space to facilitate novel prophage discovery. To validate the approach, we reanalyzed publicly available marine viromes and single-cell genomes using our feature-based approaches and found consistently more phages than were detected using current state-of-the-art tools while being notably faster. This demonstrates that our approach significantly enhances bacteriophage discovery and thus provides a new starting point for exploring new biologies. |
format | Online Article Text |
id | pubmed-7787355 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-77873552021-02-10 Rapid discovery of novel prophages using biological feature engineering and machine learning Sirén, Kimmo Millard, Andrew Petersen, Bent Gilbert, M Thomas P Clokie, Martha R J Sicheritz-Pontén, Thomas NAR Genom Bioinform Methods Article Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We present a novel, fast and generalizing machine learning method based on feature space to facilitate novel prophage discovery. To validate the approach, we reanalyzed publicly available marine viromes and single-cell genomes using our feature-based approaches and found consistently more phages than were detected using current state-of-the-art tools while being notably faster. This demonstrates that our approach significantly enhances bacteriophage discovery and thus provides a new starting point for exploring new biologies. Oxford University Press 2021-01-06 /pmc/articles/PMC7787355/ /pubmed/33575651 http://dx.doi.org/10.1093/nargab/lqaa109 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Article Sirén, Kimmo Millard, Andrew Petersen, Bent Gilbert, M Thomas P Clokie, Martha R J Sicheritz-Pontén, Thomas Rapid discovery of novel prophages using biological feature engineering and machine learning |
title | Rapid discovery of novel prophages using biological feature engineering and machine learning |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787355/ https://www.ncbi.nlm.nih.gov/pubmed/33575651 http://dx.doi.org/10.1093/nargab/lqaa109 |