PGPointNovo: an efficient neural network-based tool for parallel de novo peptide sequencing

SUMMARY: De novo peptide sequencing for tandem mass spectrometry data is not only a key technology for novel peptide identification, but also a prerequisite for many downstream tasks, such as vaccine and antibody studies. In recent years, neural network models for de novo peptide sequencing have m...

Bibliographic Details
Main Authors: Xu, Xiaofang; Yang, Chunde; He, Qiang; Shu, Kunxian; Xinpu, Yuan; Chen, Zhiguang; Zhu, Yunping; Chen, Tao
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148685/
https://www.ncbi.nlm.nih.gov/pubmed/37128577
http://dx.doi.org/10.1093/bioadv/vbad057
author Xu, Xiaofang
Yang, Chunde
He, Qiang
Shu, Kunxian
Xinpu, Yuan
Chen, Zhiguang
Zhu, Yunping
Chen, Tao
author_sort Xu, Xiaofang
collection PubMed
description SUMMARY: De novo peptide sequencing for tandem mass spectrometry data is not only a key technology for novel peptide identification, but also a prerequisite for many downstream tasks, such as vaccine and antibody studies. In recent years, neural network models for de novo peptide sequencing have shown a remarkable ability to accommodate various data sources and have outperformed conventional peptide identification tools. However, such a high-performing model is computationally expensive, taking up to 1 week to process about 400 000 spectra. This article presents PGPointNovo, a novel neural network-based tool for parallel de novo peptide sequencing. PGPointNovo uses data parallelism to accelerate training and inference and mitigates the training difficulties caused by large batch sizes. Extensive experiments on multiple datasets of different sizes show that, compared with PointNovo, a high-performing neural network-based de novo peptide sequencing tool, PGPointNovo accelerates de novo peptide sequencing by up to 7.35× without compromising precision or recall. AVAILABILITY AND IMPLEMENTATION: The source code and the parameter settings are available at https://github.com/shallFun4Learning/PGPointNovo. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
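The data parallelism mentioned in the summary above can be illustrated with a minimal sketch. It is not taken from the PGPointNovo source code: the toy linear model and all function names (`mse_gradient`, `split_evenly`, `data_parallel_gradient`) are hypothetical and chosen only for illustration. The sketch shows the core idea that, when a batch is split into equal-size shards across workers, averaging the per-worker gradients reproduces the gradient of the full batch, so adding workers effectively enlarges the batch processed per step.

```python
# Illustrative sketch only: a toy one-parameter model, not PGPointNovo's network.
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def mse_gradient(w: float, batch: Sequence[Sample]) -> float:
    """Gradient of the mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def split_evenly(batch: Sequence[Sample], workers: int) -> List[Sequence[Sample]]:
    """Split a batch into equal-size shards, one per (hypothetical) worker."""
    if len(batch) % workers != 0:
        raise ValueError("batch size must be divisible by the worker count")
    size = len(batch) // workers
    return [batch[i * size:(i + 1) * size] for i in range(workers)]


def data_parallel_gradient(w: float, batch: Sequence[Sample], workers: int) -> float:
    """Compute one gradient per shard, then average them (the all-reduce step)."""
    shards = split_evenly(batch, workers)
    local_grads = [mse_gradient(w, shard) for shard in shards]
    return sum(local_grads) / workers


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
    full = mse_gradient(0.5, data)
    parallel = data_parallel_gradient(0.5, data, workers=2)
    print(abs(full - parallel) < 1e-12)
```

In a real multi-GPU setting each shard would be processed on a separate device and the averaging would be done by a communication library. Here everything runs sequentially in one process to keep the example self-contained.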
format Online
Article
Text
id pubmed-10148685
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10148685 2023-04-30 PGPointNovo: an efficient neural network-based tool for parallel de novo peptide sequencing Xu, Xiaofang; Yang, Chunde; He, Qiang; Shu, Kunxian; Xinpu, Yuan; Chen, Zhiguang; Zhu, Yunping; Chen, Tao Bioinform Adv Application Note Oxford University Press 2023-04-25 /pmc/articles/PMC10148685/ /pubmed/37128577 http://dx.doi.org/10.1093/bioadv/vbad057 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PGPointNovo: an efficient neural network-based tool for parallel de novo peptide sequencing
topic Application Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148685/
https://www.ncbi.nlm.nih.gov/pubmed/37128577
http://dx.doi.org/10.1093/bioadv/vbad057