A Comparison of Bioinformatics Pipelines for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains

Bibliographic Details
Main authors: Afiahayati; Bernard, Stefanus; Gunadi; Wibawa, Hendra; Hakim, Mohamad Saifudin; Marcellus; Parikesit, Arli Aditya; Dewa, Chandra Kusuma; Sakakibara, Yasubumi
Format: Online Article Text
Language: English
Published: MDPI 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9394340/
https://www.ncbi.nlm.nih.gov/pubmed/35893066
http://dx.doi.org/10.3390/genes13081330
author Afiahayati,
Bernard, Stefanus
Gunadi,
Wibawa, Hendra
Hakim, Mohamad Saifudin
Marcellus,
Parikesit, Arli Aditya
Dewa, Chandra Kusuma
Sakakibara, Yasubumi
collection PubMed
description Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a newly emerged virus and the cause of the worldwide Coronavirus Disease 2019 (COVID-19) pandemic. Major breakthroughs in the Next Generation Sequencing (NGS) field followed the first release of a full-length SARS-CoV-2 genome on 10 January 2020, with the hope of turning the tide against the worsening pandemic. Previous studies in respiratory virus characterization required mapping raw sequences to the human genome in the downstream bioinformatics pipeline, in line with metagenomic principles. Illumina, a major player in the NGS arena, released guidelines for an improved enrichment kit, the Respiratory Virus Oligo Panel (RVOP), which is based on a hybridization capture method that captures targeted respiratory viruses, including SARS-CoV-2. This allows raw sequence data to be mapped directly to the SARS-CoV-2 genome in the downstream bioinformatics pipeline. Consequently, two bioinformatics pipelines emerged, with no previous studies benchmarking them. This study aims to gain insight into Illumina's target enrichment workflow by applying two different bioinformatics pipelines, named 'Fast Pipeline' and 'Normal Pipeline', to SARS-CoV-2 strains isolated from Yogyakarta and Central Java, Indonesia. Overall, both pipelines work well in characterizing SARS-CoV-2 samples, including in identifying the major studied nucleotide substitutions and amino acid mutations. A higher number of reads mapped to the SARS-CoV-2 genome in the Fast Pipeline, which was found to contribute to higher coverage depth and a higher number of identified variations (SNPs, insertions, and deletions). The Fast Pipeline ultimately works well in situations where time is a critical factor. The Normal Pipeline, on the other hand, requires more time because it maps reads to the human genome. Certain limitations were identified in the pipeline algorithms, and future studies are strongly encouraged to design the pipeline within an integrated framework, for instance by using Nextflow, a workflow framework, to combine all scripts into one fully integrated pipeline.
format Online
Article
Text
id pubmed-9394340
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9394340 2022-08-23 Genes (Basel) Article MDPI 2022-07-26 /pmc/articles/PMC9394340/ /pubmed/35893066 http://dx.doi.org/10.3390/genes13081330 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Comparison of Bioinformatics Pipelines for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9394340/
https://www.ncbi.nlm.nih.gov/pubmed/35893066
http://dx.doi.org/10.3390/genes13081330