Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples
Metagenomes can be considered as mixtures of viral, bacterial, and other eukaryotic DNA sequences. Mining viral sequences from metagenomes could shed insight into virus–host relationships and expand viral databases. Current alignment-based methods are unsuitable for identifying viral sequences from...
Main author: | Song, Kai |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Microbiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8175635/ https://www.ncbi.nlm.nih.gov/pubmed/34093479 http://dx.doi.org/10.3389/fmicb.2021.664560 |
_version_ | 1783703084033114112 |
---|---|
author | Song, Kai |
author_facet | Song, Kai |
author_sort | Song, Kai |
collection | PubMed |
description | Metagenomes can be considered as mixtures of viral, bacterial, and other eukaryotic DNA sequences. Mining viral sequences from metagenomes could shed insight into virus–host relationships and expand viral databases. Current alignment-based methods are unsuitable for identifying viral sequences from metagenome sequences because most assembled metagenomic contigs are short and possess few or no predicted genes, and most metagenomic viral genes are dissimilar to known viral genes. In this study, I developed a Markov model-based method, VirMC, to identify viral sequences from metagenomic data. VirMC uses Markov chains to model sequence signatures and construct a scoring model using a likelihood test to distinguish viral and bacterial sequences. Compared with the other two state-of-the-art viral sequence-prediction methods, VirFinder and PPR-Meta, my proposed method outperformed VirFinder and had similar performance with PPR-Meta for short contigs with length less than 400 bp. VirMC outperformed VirFinder and PPR-Meta for identifying viral sequences in contaminated metagenomic samples with eukaryotic sequences. VirMC showed better performance in assembling viral-genome sequences from metagenomic data (based on filtering potential bacterial reads). Applying VirMC to human gut metagenomes from healthy subjects and patients with type-2 diabetes (T2D) revealed that viral contigs could help classify healthy and diseased statuses. This alignment-free method complements gene-based alignment approaches and will significantly improve the precision of viral sequence identification. |
format | Online Article Text |
id | pubmed-8175635 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-81756352021-06-05 Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples Song, Kai Front Microbiol Microbiology Metagenomes can be considered as mixtures of viral, bacterial, and other eukaryotic DNA sequences. Mining viral sequences from metagenomes could shed insight into virus–host relationships and expand viral databases. Current alignment-based methods are unsuitable for identifying viral sequences from metagenome sequences because most assembled metagenomic contigs are short and possess few or no predicted genes, and most metagenomic viral genes are dissimilar to known viral genes. In this study, I developed a Markov model-based method, VirMC, to identify viral sequences from metagenomic data. VirMC uses Markov chains to model sequence signatures and construct a scoring model using a likelihood test to distinguish viral and bacterial sequences. Compared with the other two state-of-the-art viral sequence-prediction methods, VirFinder and PPR-Meta, my proposed method outperformed VirFinder and had similar performance with PPR-Meta for short contigs with length less than 400 bp. VirMC outperformed VirFinder and PPR-Meta for identifying viral sequences in contaminated metagenomic samples with eukaryotic sequences. VirMC showed better performance in assembling viral-genome sequences from metagenomic data (based on filtering potential bacterial reads). Applying VirMC to human gut metagenomes from healthy subjects and patients with type-2 diabetes (T2D) revealed that viral contigs could help classify healthy and diseased statuses. This alignment-free method complements gene-based alignment approaches and will significantly improve the precision of viral sequence identification. Frontiers Media S.A. 2021-05-21 /pmc/articles/PMC8175635/ /pubmed/34093479 http://dx.doi.org/10.3389/fmicb.2021.664560 Text en Copyright © 2021 Song. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Song, Kai Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title | Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title_full | Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title_fullStr | Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title_full_unstemmed | Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title_short | Reads Binning Improves the Assembly of Viral Genome Sequences From Metagenomic Samples |
title_sort | reads binning improves the assembly of viral genome sequences from metagenomic samples |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8175635/ https://www.ncbi.nlm.nih.gov/pubmed/34093479 http://dx.doi.org/10.3389/fmicb.2021.664560 |
work_keys_str_mv | AT songkai readsbinningimprovestheassemblyofviralgenomesequencesfrommetagenomicsamples |
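
The description above says that VirMC models sequence signatures with Markov chains and scores contigs with a likelihood test. The following is a minimal sketch of that general idea, not the published VirMC implementation: the function names (`train_markov`, `log_likelihood`, `llr_score`), the chain order of 2, the add-one pseudocount, and the toy training sequences are all assumptions made for illustration. A positive score means the sequence is more probable under the viral model than under the bacterial one.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"


def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate log P(base | preceding `order` bases) from training sequences.

    Hypothetical helper: order and pseudocount are illustrative choices,
    not values taken from the VirMC paper.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            ctx, base = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous bases such as N.
            if set(ctx + base) <= set(ALPHABET):
                counts[ctx][base] += 1
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total)
            for b in ALPHABET
        }
    return model


def log_likelihood(seq, model, order=2):
    """Sum the log transition probabilities along the sequence."""
    seq = seq.upper()
    ll = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        # Windows with ambiguous bases contribute nothing to the score.
        if ctx in model and base in model[ctx]:
            ll += model[ctx][base]
    return ll


def llr_score(seq, viral_model, bacterial_model, order=2):
    """Log-likelihood ratio: above 0 favours viral, below 0 favours bacterial."""
    return (log_likelihood(seq, viral_model, order)
            - log_likelihood(seq, bacterial_model, order))


if __name__ == "__main__":
    # Toy training data (assumed); real models would be trained on
    # reference viral and bacterial genomes.
    viral = train_markov(["ATATATATATATATAT", "TATATATATATA"])
    bacterial = train_markov(["GCGCGCGCGCGCGCGC", "CGCGCGCGCGCG"])
    for contig in ["ATATATAT", "GCGCGCGC"]:
        print(contig, llr_score(contig, viral, bacterial))
```

Because the score only needs k-mer transition counts, it can be computed for contigs that are too short to contain a predicted gene, which is the situation the description identifies as a weakness of alignment-based methods.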