DNMSO; an ontology for representing de novo sequencing results from Tandem-MS data

For the identification and sequencing of proteins, mass spectrometry (MS) has become the tool of choice and, as such, drives proteomics. MS/MS spectra need to be assigned a peptide sequence for which two strategies exist. Either database search or de novo sequencing can be employed to establish pept...

Bibliographic Details
Main authors: Takan, Savaş, Allmer, Jens
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Biochemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7585381/
https://www.ncbi.nlm.nih.gov/pubmed/33150092
http://dx.doi.org/10.7717/peerj.10216
_version_ 1783599777790820352
author Takan, Savaş
Allmer, Jens
author_facet Takan, Savaş
Allmer, Jens
author_sort Takan, Savaş
collection PubMed
description For the identification and sequencing of proteins, mass spectrometry (MS) has become the tool of choice and, as such, drives proteomics. MS/MS spectra need to be assigned a peptide sequence, for which two strategies exist. Either database search or de novo sequencing can be employed to establish peptide spectrum matches. For database search, mzIdentML is the current community standard for data representation. There is no community standard for representing de novo sequencing results, but we previously proposed the de novo markup language (DNML). At the moment, each de novo sequencing solution uses a different data representation, complicating downstream data integration, which is crucial since ensemble predictions may be more useful than predictions of a single tool. Here we propose the de novo MS Ontology (DNMSO), which can, for example, provide many-to-many mappings between spectra and peptide predictions. Additionally, an application programming interface (API) has been implemented that supports any file operation necessary for de novo sequencing, from spectra input to reading, writing, and creating the DNMSO format, as well as conversion from many other file formats. This API removes all overhead from the production of de novo sequencing tools and allows developers to concentrate completely on algorithm development. We make the API and formal descriptions of the format freely available at https://github.com/savastakan/dnmso.
format Online
Article
Text
id pubmed-7585381
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7585381 2020-11-03 DNMSO; an ontology for representing de novo sequencing results from Tandem-MS data Takan, Savaş Allmer, Jens PeerJ Biochemistry PeerJ Inc.
2020-10-21 /pmc/articles/PMC7585381/ /pubmed/33150092 http://dx.doi.org/10.7717/peerj.10216 Text en ©2020 Takan and Allmer https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Biochemistry
Takan, Savaş
Allmer, Jens
DNMSO; an ontology for representing de novo sequencing results from Tandem-MS data
title DNMSO; an ontology for representing de novo sequencing results from Tandem-MS data
topic Biochemistry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7585381/
https://www.ncbi.nlm.nih.gov/pubmed/33150092
http://dx.doi.org/10.7717/peerj.10216