
Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing

With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity...

Bibliographic Details
Main Authors: Tao, Ye; Xun, Fan; Zhao, Cheng; Mao, Zhendu; Li, Biao; Xing, Peng; Wu, Qinglong L.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9927493/
https://www.ncbi.nlm.nih.gov/pubmed/36475839
http://dx.doi.org/10.1128/spectrum.03328-22
author Tao, Ye
Xun, Fan
Zhao, Cheng
Mao, Zhendu
Li, Biao
Xing, Peng
Wu, Qinglong L.
collection PubMed
description With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity and deficiencies in key functional elements. Here, by applying long-read sequencing technology to sediment samples from a Tibetan saline lake, we comprehensively analyzed the performance of high-fidelity (HiFi) reads and the possibility of integration with short-read next-generation sequencing (NGS) data. In total, 207 full-length nonredundant 16S rRNA gene sequences and 19 full-length nonredundant 18S rRNA genes were directly obtained from HiFi reads, which greatly surpassed the retrieval performance of NGS technology. We carried out a cross-sectional comparison among multiple assembly strategies, referred to as ‘NGS’, ‘Hybrid (NGS+HiFi)’, and ‘HiFi’. Two MAGs and 29 viruses with circular genomes were reconstructed using HiFi reads alone, indicating the great power of the ‘HiFi’ approach to assemble high-quality microbial genomes. Among the 3 strategies, the ‘Hybrid’ approach produced the highest number of medium/high-quality MAGs and viral genomes, while the ratio of MAGs containing 16S rRNA genes was significantly improved in the ‘HiFi’ assembly results. Overall, our study provides a practical metagenomic resolution for analyzing complex environmental samples by taking advantage of both the short-read and HiFi long-read sequencing methods to extract the maximum amount of information, including data on prokaryotes, eukaryotes, and viruses, via the ‘Hybrid’ approach.

IMPORTANCE To expand the understanding of microbial dark matter in the environment, we did the first comparative evaluation of multiple assembly strategies based on high-throughput short-read and HiFi data from lake sediments metagenomic sequencing. The results demonstrated great improvement of the ‘Hybrid’ assembly method (short-read next-generation sequencing data plus HiFi data) in the recovery of medium/high-quality MAGs and viral genomes. Further analysis showed that HiFi data is important to retrieve the complete circular prokaryotic and viral genomes. Meanwhile, hundreds of full-length 16S/18S rRNA genes were assembled directly from HiFi data, which facilitated the species composition studies of complex environmental samples, especially for understanding micro-eukaryotes. Therefore, the application of the latest HiFi long-read sequencing could greatly improve the metagenomic assembly integrity and promote environmental microbiome research.
format Online
Article
Text
id pubmed-9927493
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-9927493 2023-02-15 Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing Tao, Ye; Xun, Fan; Zhao, Cheng; Mao, Zhendu; Li, Biao; Xing, Peng; Wu, Qinglong L. Microbiol Spectr Research Article American Society for Microbiology 2022-12-08 /pmc/articles/PMC9927493/ /pubmed/36475839 http://dx.doi.org/10.1128/spectrum.03328-22 Text en Copyright © 2022 Tao et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing
topic Research Article