Significance estimation for large scale metabolomics annotations by spectral matching

Bibliographic Details
Main authors: Scheubert, Kerstin; Hufsky, Franziska; Petras, Daniel; Wang, Mingxun; Nothias, Louis-Félix; Dührkop, Kai; Bandeira, Nuno; Dorrestein, Pieter C.; Böcker, Sebastian
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5684233/
https://www.ncbi.nlm.nih.gov/pubmed/29133785
http://dx.doi.org/10.1038/s41467-017-01318-5
Description
Summary: The annotation of small molecules in untargeted mass spectrometry relies on matching fragment spectra to reference library spectra. While various spectrum-spectrum match scores exist, the field lacks statistical methods for estimating the false discovery rate (FDR) of these annotations. We present empirical Bayes and target-decoy based methods to estimate the FDR for 70 public metabolomics data sets. We show that the spectral matching settings need to be adjusted for each project. By adjusting the scoring parameters and thresholds, the number of annotations rose by 139% on average (ranging from −92% to +5705%) compared with a default parameter set available at GNPS. The FDR estimation methods presented will enable a user to assess the scoring criteria for large-scale analysis of mass spectrometry-based metabolomics data, a capability that has been essential in the advancement of proteomics, transcriptomics, and genomics.
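
As a rough illustration of the target-decoy idea named in the summary, the sketch below estimates an FDR and q-values from two lists of match scores. It is a generic textbook-style estimate, not the authors' implementation: the function names (`target_decoy_qvalues`, `count_accepted`) and the example scores are hypothetical, decoy spectrum generation is not shown, and the sketch assumes that higher scores mean better matches and that the decoy library is about the same size as the target library.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    target_scores: List[float], decoy_scores: List[float]
) -> List[Tuple[float, float]]:
    """Return (score, q_value) for every target hit, best score first."""
    # Label each hit: True = target library match, False = decoy match.
    labeled = [(s, True) for s in target_scores]
    labeled += [(s, False) for s in decoy_scores]
    # Sort by descending score; on ties, decoys come first (conservative).
    labeled.sort(key=lambda hit: (-hit[0], hit[1]))

    # Estimated FDR at each target hit: decoys seen so far / targets seen so far.
    fdrs: List[Tuple[float, float]] = []
    n_target = 0
    n_decoy = 0
    for score, is_target in labeled:
        if is_target:
            n_target += 1
            fdrs.append((score, min(1.0, n_decoy / n_target)))
        else:
            n_decoy += 1

    # q-value: smallest estimated FDR at this or any more permissive threshold.
    qvalues: List[Tuple[float, float]] = []
    running_min = 1.0
    for score, fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append((score, running_min))
    qvalues.reverse()
    return qvalues


def count_accepted(qvalues: List[Tuple[float, float]], max_fdr: float) -> int:
    """Number of target hits whose q-value is at or below max_fdr."""
    return sum(1 for _, q in qvalues if q <= max_fdr)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    targets = [0.95, 0.90, 0.80, 0.70, 0.60]
    decoys = [0.85, 0.65, 0.30]
    for score, q in target_decoy_qvalues(targets, decoys):
        print(f"score={score:.2f}  q={q:.3f}")
```

Counting accepted hits at a fixed q-value cutoff for each candidate parameter set is one simple way to compare scoring settings across projects, which is the kind of per-project adjustment the summary describes.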