Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns

Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To in...

Bibliographic Details
Main authors: McEachran, Andrew D., Balabin, Ilya, Cathey, Tommy, Transue, Thomas R., Al-Ghoul, Hussein, Grulke, Chris, Sobus, Jon R., Williams, Antony J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6677792/
https://www.ncbi.nlm.nih.gov/pubmed/31375670
http://dx.doi.org/10.1038/s41597-019-0145-z
_version_ 1783440957358735360
author McEachran, Andrew D.
Balabin, Ilya
Cathey, Tommy
Transue, Thomas R.
Al-Ghoul, Hussein
Grulke, Chris
Sobus, Jon R.
Williams, Antony J.
author_facet McEachran, Andrew D.
Balabin, Ilya
Cathey, Tommy
Transue, Thomas R.
Al-Ghoul, Hussein
Grulke, Chris
Sobus, Jon R.
Williams, Antony J.
author_sort McEachran, Andrew D.
collection PubMed
description Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS(2)) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA’s CompTox Chemicals Dashboard.
format Online
Article
Text
id pubmed-6677792
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-66777922019-08-05 Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns McEachran, Andrew D. Balabin, Ilya Cathey, Tommy Transue, Thomas R. Al-Ghoul, Hussein Grulke, Chris Sobus, Jon R. Williams, Antony J. Sci Data Data Descriptor Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS(2)) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA’s CompTox Chemicals Dashboard. Nature Publishing Group UK 2019-08-02 /pmc/articles/PMC6677792/ /pubmed/31375670 http://dx.doi.org/10.1038/s41597-019-0145-z Text en © This is a U.S. 
government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.
spellingShingle Data Descriptor
McEachran, Andrew D.
Balabin, Ilya
Cathey, Tommy
Transue, Thomas R.
Al-Ghoul, Hussein
Grulke, Chris
Sobus, Jon R.
Williams, Antony J.
Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title_full Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title_fullStr Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title_full_unstemmed Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title_short Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
title_sort linking in silico ms/ms spectra with chemistry data to improve identification of unknowns
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6677792/
https://www.ncbi.nlm.nih.gov/pubmed/31375670
http://dx.doi.org/10.1038/s41597-019-0145-z