HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench
Next-generation sequencing (NGS) has actualized the human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference g...
Main authors: | Shen-Gunther, Jane; Xia, Qingqing; Cai, Hong; Wang, Yufeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8398645/ https://www.ncbi.nlm.nih.gov/pubmed/34451490 http://dx.doi.org/10.3390/pathogens10081026 |
_version_ | 1783744889165447168 |
---|---|
author | Shen-Gunther, Jane Xia, Qingqing Cai, Hong Wang, Yufeng |
author_facet | Shen-Gunther, Jane Xia, Qingqing Cai, Hong Wang, Yufeng |
author_sort | Shen-Gunther, Jane |
collection | PubMed |
description | Next-generation sequencing (NGS) has actualized the human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference genomes. To address this, we developed and tested automated workflows for HPV taxonomic profiling and visualization using a customized papillomavirus database in the CLC Microbial Genomics Module. HPV genomes from Papilloma Virus Episteme were customized and incorporated into CLC “ready-to-use” workflows for stepwise data processing to include: (1) Taxonomic Analysis, (2) Estimate Alpha/Beta Diversities, and (3) Map Reads to Reference. Low-grade (n = 95) and high-grade (n = 60) Pap smears were tested with ensuing collective runtimes: Taxonomic Analysis (36 min); Alpha/Beta Diversities (5 s); Map Reads (45 min). Tabular output conversion to visualizations entailed 1–2 keystrokes. Biodiversity analysis between low- (LSIL) and high-grade squamous intraepithelial lesions (HSIL) revealed loss of species richness and gain of dominance by HPV-16 in HSIL. Integrating clinically relevant, taxonomized HPV reference genomes within automated workflows proved to be an ultra-fast method of virome profiling. The entire process named “HPV DeepSeq” provides a simple, accurate and practical means of NGS data analysis for a broad range of applications in viral research. |
format | Online Article Text |
id | pubmed-8398645 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8398645 2021-08-29 HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench Shen-Gunther, Jane Xia, Qingqing Cai, Hong Wang, Yufeng Pathogens Article MDPI 2021-08-13 /pmc/articles/PMC8398645/ /pubmed/34451490 http://dx.doi.org/10.3390/pathogens10081026 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Shen-Gunther, Jane Xia, Qingqing Cai, Hong Wang, Yufeng HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench |
title | HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench |
title_sort | hpv deepseq: an ultra-fast method of ngs data analysis and visualization using automated workflows and a customized papillomavirus database in clc genomics workbench |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8398645/ https://www.ncbi.nlm.nih.gov/pubmed/34451490 http://dx.doi.org/10.3390/pathogens10081026 |
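
The description above reports a loss of species richness and a gain of dominance by HPV-16 in HSIL relative to LSIL. As a rough illustration of what those alpha-diversity terms measure, the following minimal sketch computes richness, the Shannon index, and Berger-Parker dominance from per-type read counts. It is not the CLC Microbial Genomics Module implementation; the function name `alpha_diversity`, the sample dictionaries, and every read count are hypothetical values invented for illustration and are not data from the article.

```python
import math
from typing import Dict


def alpha_diversity(counts: Dict[str, int]) -> Dict[str, float]:
    """Summarize one sample's HPV type counts (hypothetical helper).

    richness  : number of types with at least one read
    shannon   : Shannon index, -sum(p * ln p) over observed types
    dominance : Berger-Parker index, share of the most abundant type
    """
    # Ignore types with zero reads; they do not count toward richness.
    present = {hpv: n for hpv, n in counts.items() if n > 0}
    total = sum(present.values())
    if total == 0:
        return {"richness": 0, "shannon": 0.0, "dominance": 0.0}

    proportions = [n / total for n in present.values()]
    shannon = -sum(p * math.log(p) for p in proportions)
    return {
        "richness": len(present),
        "shannon": shannon,
        "dominance": max(proportions),
    }


# Made-up read counts, chosen only to show the direction of the effect.
lsil_like = {"HPV-16": 40, "HPV-51": 30, "HPV-52": 20, "HPV-66": 10}
hsil_like = {"HPV-16": 90, "HPV-52": 10, "HPV-66": 0}

if __name__ == "__main__":
    for label, sample in (("LSIL-like", lsil_like), ("HSIL-like", hsil_like)):
        print(label, alpha_diversity(sample))
```

In this toy input the HSIL-like sample has fewer observed types and a larger share of reads assigned to HPV-16, which is the pattern the abstract describes qualitatively.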