Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To in...
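Main authors: | McEachran, Andrew D.; Balabin, Ilya; Cathey, Tommy; Transue, Thomas R.; Al-Ghoul, Hussein; Grulke, Chris; Sobus, Jon R.; Williams, Antony J. |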
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2019 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6677792/ https://www.ncbi.nlm.nih.gov/pubmed/31375670 http://dx.doi.org/10.1038/s41597-019-0145-z |
_version_ | 1783440957358735360 |
---|---|
author | McEachran, Andrew D. Balabin, Ilya Cathey, Tommy Transue, Thomas R. Al-Ghoul, Hussein Grulke, Chris Sobus, Jon R. Williams, Antony J. |
author_facet | McEachran, Andrew D. Balabin, Ilya Cathey, Tommy Transue, Thomas R. Al-Ghoul, Hussein Grulke, Chris Sobus, Jon R. Williams, Antony J. |
author_sort | McEachran, Andrew D. |
collection | PubMed |
description | Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS²) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA’s CompTox Chemicals Dashboard. |
format | Online Article Text |
id | pubmed-6677792 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-6677792 2019-08-05 Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns McEachran, Andrew D. Balabin, Ilya Cathey, Tommy Transue, Thomas R. Al-Ghoul, Hussein Grulke, Chris Sobus, Jon R. Williams, Antony J. Sci Data Data Descriptor Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS²) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA’s CompTox Chemicals Dashboard. Nature Publishing Group UK 2019-08-02 /pmc/articles/PMC6677792/ /pubmed/31375670 http://dx.doi.org/10.1038/s41597-019-0145-z Text en © This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article. |
spellingShingle | Data Descriptor McEachran, Andrew D. Balabin, Ilya Cathey, Tommy Transue, Thomas R. Al-Ghoul, Hussein Grulke, Chris Sobus, Jon R. Williams, Antony J. Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title | Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title_full | Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title_fullStr | Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title_full_unstemmed | Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title_short | Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns |
title_sort | linking in silico ms/ms spectra with chemistry data to improve identification of unknowns |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6677792/ https://www.ncbi.nlm.nih.gov/pubmed/31375670 http://dx.doi.org/10.1038/s41597-019-0145-z |
work_keys_str_mv | AT mceachranandrewd linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT balabinilya linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT catheytommy linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT transuethomasr linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT alghoulhussein linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT grulkechris linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT sobusjonr linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns AT williamsantonyj linkinginsilicomsmsspectrawithchemistrydatatoimproveidentificationofunknowns |