Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data

The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species vi...

Bibliographic Details
Main Authors: Hu, Yiheng, Irinyi, Laszlo, Hoang, Minh Thuy Vi, Eenjes, Tavish, Graetz, Abigail, Stone, Eric A., Meyer, Wieland, Schwessinger, Benjamin, Rathjen, John P.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040722/
https://www.ncbi.nlm.nih.gov/pubmed/35404122
http://dx.doi.org/10.1128/mbio.02444-21
_version_ 1784694393796558848
author Hu, Yiheng
Irinyi, Laszlo
Hoang, Minh Thuy Vi
Eenjes, Tavish
Graetz, Abigail
Stone, Eric A.
Meyer, Wieland
Schwessinger, Benjamin
Rathjen, John P.
author_sort Hu, Yiheng
collection PubMed
description The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species via high-throughput sequencing is increasingly becoming the norm for pathogen detection, microbiome studies, and environmental monitoring. With the rapid development of sequencing technologies, however, standardized procedures for taxonomic assignment of long sequence reads have not yet been well established. Focusing on nanopore sequencing technology, we compared classification and community composition analysis pipelines using shotgun and amplicon sequencing data generated from mock communities comprising 43 fungal species. We show that regardless of the sequencing methodology used, the highest accuracy of species identification was achieved by sequence alignment against a fungal-specific database. During the assessment of classification algorithms, we found that applying cutoffs to the query coverage of each read or contig significantly improved the classification accuracy and community composition analysis without major data loss. We also generated draft genome assemblies for three fungal species from nanopore data which were absent from genome databases. Our study improves sequence-based classification and estimation of relative sequence abundance using real fungal community data and provides a practical guide for the design of metagenomics analyses focusing on fungi.
format Online
Article
Text
id pubmed-9040722
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-9040722 2022-04-27 Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data mBio Research Article
American Society for Microbiology 2022-04-11 /pmc/articles/PMC9040722/ /pubmed/35404122 http://dx.doi.org/10.1128/mbio.02444-21 Text en Copyright © 2022 Hu et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040722/
https://www.ncbi.nlm.nih.gov/pubmed/35404122
http://dx.doi.org/10.1128/mbio.02444-21