Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing

Among molecular-based techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in clinical routine laboratories due to its simplicity, universality, efficacy, and affordabili...

Bibliographic Details
Main Authors: Langsiri, Nattapong, Worasilchai, Navaporn, Irinyi, Laszlo, Jenjaroenpun, Piroon, Wongsurawat, Thidathip, Luangsa-ard, Janet Jennifer, Meyer, Wieland, Chindamporn, Ariya
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483712/
https://www.ncbi.nlm.nih.gov/pubmed/37674240
http://dx.doi.org/10.1186/s43008-023-00125-6
author Langsiri, Nattapong
Worasilchai, Navaporn
Irinyi, Laszlo
Jenjaroenpun, Piroon
Wongsurawat, Thidathip
Luangsa-ard, Janet Jennifer
Meyer, Wieland
Chindamporn, Ariya
collection PubMed
description Among molecular-based techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in clinical routine laboratories due to its simplicity, universality, efficacy, and affordability for fungal species identification. However, Sanger sequencing fails to identify mixed ITS sequences in the case of mixed infections. To overcome this limitation, different high-throughput sequencing technologies have been explored. The nanopore-based technology is now one of the most promising long-read sequencing technologies on the market as it has the potential to sequence the full-length ITS region in a single read. In this study, we established a workflow for species identification using the sequences of the entire ITS region generated by nanopore sequencing of both pure yeast isolates and mocked mixed species reads generated with different scenarios. The species used in this study included Candida albicans (n = 2), Candida tropicalis (n = 1), Nakaseomyces glabratus (formerly Candida glabrata) (n = 1), Trichosporon asahii (n = 2), Pichia kudriavzevii (formerly Candida krusei) (n = 1), and Cryptococcus neoformans (n = 1). Comparing various methods to generate the consensus sequence for fungal species identification, the results from this study indicate that read clustering using a modified version of the NanoCLUST pipeline is more sensitive than Canu or VSEARCH, as it classified species accurately with a lower abundance cluster of reads (3% abundance compared to 10% with VSEARCH). The modified NanoCLUST also reduced the number of classified clusters compared to VSEARCH, making the subsequent BLAST+ analysis faster. 
Subsampling of the datasets, which reduces the size of the datasets by approximately tenfold, did not significantly affect the identification results in terms of the identified species name, percent identity, query coverage, percentage of reads in the classified cluster, and the number of clusters. The ability of the method to distinguish mixed species within sub-populations of large datasets has the potential to aid computer analysis by reducing the required processing power. The herein presented new sequence analysis pipeline will facilitate better interpretation of fungal sequence data for species identification. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s43008-023-00125-6.
format Online
Article
Text
id pubmed-10483712
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10483712 2023-09-08 IMA Fungus Research BioMed Central 2023-09-06 /pmc/articles/PMC10483712/ /pubmed/37674240 http://dx.doi.org/10.1186/s43008-023-00125-6 Text en
© The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483712/
https://www.ncbi.nlm.nih.gov/pubmed/37674240
http://dx.doi.org/10.1186/s43008-023-00125-6