Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data
The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species vi...
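Main Authors: | Hu, Yiheng; Irinyi, Laszlo; Hoang, Minh Thuy Vi; Eenjes, Tavish; Graetz, Abigail; Stone, Eric A.; Meyer, Wieland; Schwessinger, Benjamin; Rathjen, John P. |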
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040722/ https://www.ncbi.nlm.nih.gov/pubmed/35404122 http://dx.doi.org/10.1128/mbio.02444-21 |
_version_ | 1784694393796558848 |
---|---|
author | Hu, Yiheng Irinyi, Laszlo Hoang, Minh Thuy Vi Eenjes, Tavish Graetz, Abigail Stone, Eric A. Meyer, Wieland Schwessinger, Benjamin Rathjen, John P. |
author_facet | Hu, Yiheng Irinyi, Laszlo Hoang, Minh Thuy Vi Eenjes, Tavish Graetz, Abigail Stone, Eric A. Meyer, Wieland Schwessinger, Benjamin Rathjen, John P. |
author_sort | Hu, Yiheng |
collection | PubMed |
description | The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species via high-throughput sequencing is increasingly becoming the norm for pathogen detection, microbiome studies, and environmental monitoring. With the rapid development of sequencing technologies, however, standardized procedures for taxonomic assignment of long sequence reads have not yet been well established. Focusing on nanopore sequencing technology, we compared classification and community composition analysis pipelines using shotgun and amplicon sequencing data generated from mock communities comprising 43 fungal species. We show that regardless of the sequencing methodology used, the highest accuracy of species identification was achieved by sequence alignment against a fungal-specific database. During the assessment of classification algorithms, we found that applying cutoffs to the query coverage of each read or contig significantly improved the classification accuracy and community composition analysis without major data loss. We also generated draft genome assemblies for three fungal species from nanopore data which were absent from genome databases. Our study improves sequence-based classification and estimation of relative sequence abundance using real fungal community data and provides a practical guide for the design of metagenomics analyses focusing on fungi. |
format | Online Article Text |
id | pubmed-9040722 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-90407222022-04-27 Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data Hu, Yiheng Irinyi, Laszlo Hoang, Minh Thuy Vi Eenjes, Tavish Graetz, Abigail Stone, Eric A. Meyer, Wieland Schwessinger, Benjamin Rathjen, John P. mBio Research Article The kingdom Fungi is highly diverse in morphology and ecosystem function. Yet fungi are challenging to characterize as they can be difficult to culture and morphologically indistinct. Overall, their description and analysis lag far behind other microbes such as bacteria. Classification of species via high-throughput sequencing is increasingly becoming the norm for pathogen detection, microbiome studies, and environmental monitoring. With the rapid development of sequencing technologies, however, standardized procedures for taxonomic assignment of long sequence reads have not yet been well established. Focusing on nanopore sequencing technology, we compared classification and community composition analysis pipelines using shotgun and amplicon sequencing data generated from mock communities comprising 43 fungal species. We show that regardless of the sequencing methodology used, the highest accuracy of species identification was achieved by sequence alignment against a fungal-specific database. During the assessment of classification algorithms, we found that applying cutoffs to the query coverage of each read or contig significantly improved the classification accuracy and community composition analysis without major data loss. We also generated draft genome assemblies for three fungal species from nanopore data which were absent from genome databases. Our study improves sequence-based classification and estimation of relative sequence abundance using real fungal community data and provides a practical guide for the design of metagenomics analyses focusing on fungi. 
American Society for Microbiology 2022-04-11 /pmc/articles/PMC9040722/ /pubmed/35404122 http://dx.doi.org/10.1128/mbio.02444-21 Text en Copyright © 2022 Hu et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Hu, Yiheng Irinyi, Laszlo Hoang, Minh Thuy Vi Eenjes, Tavish Graetz, Abigail Stone, Eric A. Meyer, Wieland Schwessinger, Benjamin Rathjen, John P. Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title | Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title_full | Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title_fullStr | Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title_full_unstemmed | Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title_short | Inferring Species Compositions of Complex Fungal Communities from Long- and Short-Read Sequence Data |
title_sort | inferring species compositions of complex fungal communities from long- and short-read sequence data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040722/ https://www.ncbi.nlm.nih.gov/pubmed/35404122 http://dx.doi.org/10.1128/mbio.02444-21 |
work_keys_str_mv | AT huyiheng inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT irinyilaszlo inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT hoangminhthuyvi inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT eenjestavish inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT graetzabigail inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT stoneerica inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT meyerwieland inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT schwessingerbenjamin inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata AT rathjenjohnp inferringspeciescompositionsofcomplexfungalcommunitiesfromlongandshortreadsequencedata |