
ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics

Data set acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) data sets, due to the high levels of data reduction that...


Bibliographic Details
Main Authors: Rehfeldt, Tobias G., Gabriels, Ralf, Bouwmeester, Robbin, Gessulat, Siegfried, Neely, Benjamin A., Palmblad, Magnus, Perez-Riverol, Yasset, Schmidt, Tobias, Vizcaíno, Juan Antonio, Deutsch, Eric W.
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9903315/
https://www.ncbi.nlm.nih.gov/pubmed/36693629
http://dx.doi.org/10.1021/acs.jproteome.2c00629
_version_ 1784883446791798784
author Rehfeldt, Tobias G.
Gabriels, Ralf
Bouwmeester, Robbin
Gessulat, Siegfried
Neely, Benjamin A.
Palmblad, Magnus
Perez-Riverol, Yasset
Schmidt, Tobias
Vizcaíno, Juan Antonio
Deutsch, Eric W.
author_sort Rehfeldt, Tobias G.
collection PubMed
description Data set acquisition and curation are often the most difficult and time-consuming parts of a machine learning endeavor. This is especially true for proteomics-based liquid chromatography (LC) coupled to mass spectrometry (MS) data sets, due to the high levels of data reduction that occur between raw data and machine learning-ready data. Since predictive proteomics is an emerging field, when predicting peptide behavior in LC-MS setups, each lab often uses unique and complex data processing pipelines in order to maximize performance, at the cost of accessibility and reproducibility. For this reason we introduce ProteomicsML, an online resource for proteomics-based data sets and tutorials across most of the currently explored physicochemical peptide properties. This community-driven resource makes it simple to access data in easy-to-process formats, and contains easy-to-follow tutorials that allow new users to interact with even the most advanced algorithms in the field. ProteomicsML provides data sets that are useful for comparing state-of-the-art machine learning algorithms, as well as providing introductory material for teachers and newcomers to the field alike. The platform is freely available at https://www.proteomicsml.org/, and we welcome the entire proteomics community to contribute to the project at https://github.com/ProteomicsML/ProteomicsML.
format Online
Article
Text
id pubmed-9903315
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9903315 2023-02-08 ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics Rehfeldt, Tobias G. Gabriels, Ralf Bouwmeester, Robbin Gessulat, Siegfried Neely, Benjamin A. Palmblad, Magnus Perez-Riverol, Yasset Schmidt, Tobias Vizcaíno, Juan Antonio Deutsch, Eric W. J Proteome Res American Chemical Society 2023-01-24 /pmc/articles/PMC9903315/ /pubmed/36693629 http://dx.doi.org/10.1021/acs.jproteome.2c00629 Text en © 2023 The Authors.
Published by American Chemical Society. https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).
title ProteomicsML: An Online Platform for Community-Curated Data sets and Tutorials for Machine Learning in Proteomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9903315/
https://www.ncbi.nlm.nih.gov/pubmed/36693629
http://dx.doi.org/10.1021/acs.jproteome.2c00629