
Accelerating the alignment processing speed of the comprehensive end-to-end whole-genome bisulfite sequencing pipeline, wg-blimp

Analyzing whole-genome bisulfite and related sequencing datasets is a time-intensive process due to the complexity and size of the raw input sequencing files and the lengthy read alignment step, which requires correction for the conversion of all unmethylated Cs to Ts genome-wide. The objective of this study was...

Bibliographic Details
Main Authors: Lehle, Jake D, McCarrey, John R
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329742/
https://www.ncbi.nlm.nih.gov/pubmed/37431446
http://dx.doi.org/10.1093/biomethods/bpad012
author Lehle, Jake D
McCarrey, John R
collection PubMed
description Analyzing whole-genome bisulfite and related sequencing datasets is a time-intensive process due to the complexity and size of the raw input sequencing files and the lengthy read alignment step, which requires correction for the conversion of all unmethylated Cs to Ts genome-wide. The objective of this study was to modify the read alignment algorithm associated with the whole-genome bisulfite sequencing methylation analysis pipeline (wg-blimp) to shorten the time required to complete this phase while retaining overall read alignment accuracy. Here, we report an update to the recently published pipeline wg-blimp, achieved by replacing the bwa-meth aligner with the faster gemBS aligner. This improvement to the wg-blimp pipeline has led to a more than 7-fold acceleration in the processing speed of samples when scaled to larger publicly available FASTQ datasets containing 80–160 million reads, while maintaining nearly identical accuracy of properly mapped reads when compared with data from the previous pipeline. The modifications to the wg-blimp pipeline reported here merge the speed and accuracy of the gemBS aligner with the comprehensive analysis and data visualization assets of the wg-blimp pipeline to provide a significantly accelerated workflow that can produce high-quality data much more rapidly without compromising read accuracy, although at the expense of increasing RAM requirements up to 48 GB.
format Online
Article
Text
id pubmed-10329742
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10329742 2023-07-10 Accelerating the alignment processing speed of the comprehensive end-to-end whole-genome bisulfite sequencing pipeline, wg-blimp. Lehle, Jake D; McCarrey, John R. Biol Methods Protoc (Innovations). Oxford University Press, 2023-06-27. /pmc/articles/PMC10329742/ /pubmed/37431446 http://dx.doi.org/10.1093/biomethods/bpad012 Text en
© The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Accelerating the alignment processing speed of the comprehensive end-to-end whole-genome bisulfite sequencing pipeline, wg-blimp
topic Innovations
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10329742/
https://www.ncbi.nlm.nih.gov/pubmed/37431446
http://dx.doi.org/10.1093/biomethods/bpad012