
Novor: Real-Time Peptide de Novo Sequencing Software

De novo sequencing software has been widely used in proteomics to sequence new peptides from tandem mass spectrometry data. This study presents a new software tool, Novor, to greatly improve both the speed and accuracy of today’s peptide de novo sequencing analyses. To improve the accuracy, Novor’s...


Bibliographic Details
Main Author: Ma, Bin
Format: Online Article Text
Language: English
Published: Springer US 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4604512/
https://www.ncbi.nlm.nih.gov/pubmed/26122521
http://dx.doi.org/10.1007/s13361-015-1204-0
_version_ 1782395068568567808
author Ma, Bin
author_facet Ma, Bin
author_sort Ma, Bin
collection PubMed
description De novo sequencing software has been widely used in proteomics to sequence new peptides from tandem mass spectrometry data. This study presents a new software tool, Novor, to greatly improve both the speed and accuracy of today’s peptide de novo sequencing analyses. To improve accuracy, Novor’s scoring functions are based on two large decision trees built by machine learning from a peptide spectral library of more than 300,000 spectra. Important knowledge about peptide fragmentation is extracted automatically from the library and incorporated into the scoring functions. The decision tree model also enables efficient score calculation and contributes to the speed improvement. To further improve the speed, a two-stage algorithmic approach, namely dynamic programming and refinement, is used. The software program was also carefully optimized. On the testing datasets, Novor sequenced 7%–37% more correct residues than the state-of-the-art de novo sequencing tool, PEAKS, while being an order of magnitude faster. Novor can de novo sequence more than 300 MS/MS spectra per second on a laptop computer. This speed surpasses the acquisition speed of today’s mass spectrometers and therefore opens up the possibility of de novo sequencing in real time while the spectrometer is acquiring the spectral data. [Figure: see text] ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s13361-015-1204-0) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4604512
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-4604512 2015-10-19 Novor: Real-Time Peptide de Novo Sequencing Software Ma, Bin J Am Soc Mass Spectrom Focus: 20 Year Anniversary of SEQUEST: Research Article
Springer US 2015-06-30 2015 /pmc/articles/PMC4604512/ /pubmed/26122521 http://dx.doi.org/10.1007/s13361-015-1204-0 Text en © The Author(s) 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle Focus: 20 Year Anniversary of SEQUEST: Research Article
Ma, Bin
Novor: Real-Time Peptide de Novo Sequencing Software
title Novor: Real-Time Peptide de Novo Sequencing Software
title_full Novor: Real-Time Peptide de Novo Sequencing Software
title_fullStr Novor: Real-Time Peptide de Novo Sequencing Software
title_full_unstemmed Novor: Real-Time Peptide de Novo Sequencing Software
title_short Novor: Real-Time Peptide de Novo Sequencing Software
title_sort novor: real-time peptide de novo sequencing software
topic Focus: 20 Year Anniversary of SEQUEST: Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4604512/
https://www.ncbi.nlm.nih.gov/pubmed/26122521
http://dx.doi.org/10.1007/s13361-015-1204-0
work_keys_str_mv AT mabin novorrealtimepeptidedenovosequencingsoftware