
SWPepNovo: An Efficient De Novo Peptide Sequencing Tool for Large-scale MS/MS Spectra Analysis

Bibliographic Details
Main authors: Li, Chuang; Li, Kenli; Li, Keqin; Xie, Xianghui; Lin, Feng
Format: Online Article Text
Language: English
Published: Ivyspring International Publisher 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6743289/
https://www.ncbi.nlm.nih.gov/pubmed/31523183
http://dx.doi.org/10.7150/ijbs.32142
_version_ 1783451258243252224
author Li, Chuang
Li, Kenli
Li, Keqin
Xie, Xianghui
Lin, Feng
author_sort Li, Chuang
collection PubMed
description Tandem mass spectrometry (MS/MS)-based de novo peptide sequencing is a powerful method for high-throughput protein analysis. However, the rapidly growing size of MS/MS spectra datasets inevitably and exponentially raises the computational demand of existing de novo peptide sequencing methods, an issue that urgently needs to be solved in computational biology. This paper introduces SWPepNovo, an efficient tool based on the SW26010 many-core processor that processes large-scale peptide MS/MS spectra using a parallel peptide-spectrum matches (PSMs) algorithm. The design employs a two-level parallelization mechanism: (1) task-level parallelism between management processing elements (MPEs) using MPI, based on a data transformation method and a dynamic feedback task scheduling algorithm, and (2) thread-level parallelism across computing processing elements (CPEs) using asynchronous task transfer and multithreading. Moreover, three optimization strategies (vectorization, double buffering, and memory access optimizations) are employed to overcome both the compute-bound and the memory-bound bottlenecks in the parallel PSMs algorithm. Experiments conducted on multiple spectra datasets demonstrate the performance of SWPepNovo against three state-of-the-art peptide sequencing tools: PepNovo+, PEAKS, and DeepNovo-DIA. SWPepNovo also shows high scalability in experiments on extremely large datasets of up to 11.22 GB. The software and the parameter settings are available at https://github.com/ChuangLi99/SWPepNovo.
format Online
Article
Text
id pubmed-6743289
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Ivyspring International Publisher
record_format MEDLINE/PubMed
spelling pubmed-6743289
2019-09-14
SWPepNovo: An Efficient De Novo Peptide Sequencing Tool for Large-scale MS/MS Spectra Analysis
Li, Chuang
Li, Kenli
Li, Keqin
Xie, Xianghui
Lin, Feng
Int J Biol Sci
Research Paper
Ivyspring International Publisher
2019-07-03
/pmc/articles/PMC6743289/
/pubmed/31523183
http://dx.doi.org/10.7150/ijbs.32142
Text
en
© The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/). See http://ivyspring.com/terms for full terms and conditions.
title SWPepNovo: An Efficient De Novo Peptide Sequencing Tool for Large-scale MS/MS Spectra Analysis
title_sort swpepnovo: an efficient de novo peptide sequencing tool for large-scale ms/ms spectra analysis
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6743289/
https://www.ncbi.nlm.nih.gov/pubmed/31523183
http://dx.doi.org/10.7150/ijbs.32142