
A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity

Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5–6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conduc...


Bibliographic Details
Main authors: Gong, Yu-Nong, Chen, Guang-Wu, Yang, Shu-Li, Lee, Ching-Ju, Shih, Shin-Ru, Tsao, Kuo-Chien
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795770/
https://www.ncbi.nlm.nih.gov/pubmed/26986479
http://dx.doi.org/10.1371/journal.pone.0151495
author Gong, Yu-Nong
Chen, Guang-Wu
Yang, Shu-Li
Lee, Ching-Ju
Shih, Shin-Ru
Tsao, Kuo-Chien
collection PubMed
description Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5–6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conducted for each group of mixed samples, and the proposed data analysis pipeline was used to identify viral pathogens in these mixed samples. Polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA) was then conducted individually for each of the 42 isolates, depending on the viral types predicted in its group. Two isolates remained unknown after these tests. Iteration mapping was then applied to each of these 2 isolates and predicted human parechovirus (HPeV) in both. In summary, our NGS pipeline detected the following viruses among the 42 isolates: 29 human rhinoviruses (HRVs), 10 HPeVs, 1 human adenovirus (HAdV), 1 echovirus and 1 rotavirus. We then focused on the 10 identified Taiwanese HPeVs because their reported clinical significance is greater than that of HRVs. Their genomes were assembled and their genetic diversity was explored. One novel 6-bp deletion was found in one HPeV-1 virus. In terms of nucleotide heterogeneity, 64 genetic variants were detected in these HPeVs using the mapped NGS reads. Most importantly, a recombination event was found between our HPeV-3 and a known HPeV-4 strain in the database. A similar event was detected in the other HPeV-3 strains in the same clade of the phylogenetic tree. These findings demonstrate that the proposed NGS data analysis pipeline identified unknown viruses in mixed clinical samples, revealed their genetic identity and variants, and characterized their genetic features in terms of viral evolution.
format Online
Article
Text
id pubmed-4795770
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4795770 2016-03-23 A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity Gong, Yu-Nong Chen, Guang-Wu Yang, Shu-Li Lee, Ching-Ju Shih, Shin-Ru Tsao, Kuo-Chien PLoS One Research Article Public Library of Science 2016-03-17 /pmc/articles/PMC4795770/ /pubmed/26986479 http://dx.doi.org/10.1371/journal.pone.0151495 Text en © 2016 Gong et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity
topic Research Article
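The pooling step mentioned in the abstract (42 isolates split into 8 groups of 5–6) can be illustrated with a minimal Python sketch. The record does not say how isolates were assigned to groups, so the even-split rule, the function name `pool_isolates`, and the isolate IDs below are assumptions made only for illustration. The virus counts are the ones reported in the abstract.

```python
def pool_isolates(isolate_ids, n_pools):
    """Split isolate IDs into n_pools consecutive groups whose sizes differ by at most one.

    Assumption: an even split. The abstract only states "8 groups of 5-6 isolates".
    """
    base, extra = divmod(len(isolate_ids), n_pools)
    pools, start = [], 0
    for i in range(n_pools):
        # The first `extra` pools take one additional isolate.
        size = base + (1 if i < extra else 0)
        pools.append(isolate_ids[start:start + size])
        start += size
    return pools


# Hypothetical isolate identifiers; the real sample names are not given in the record.
isolates = [f"isolate_{i:02d}" for i in range(1, 43)]
pools = pool_isolates(isolates, 8)
sizes = [len(p) for p in pools]

# Per-virus counts as reported in the abstract.
detected = {"HRV": 29, "HPeV": 10, "HAdV": 1, "echovirus": 1, "rotavirus": 1}

print(sizes)
print(sum(detected.values()))
```

Under this even-split assumption every pool holds 5 or 6 isolates, and the reported per-virus counts add up to the 42 isolates collected.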