ppx: Programmatic access to proteomics data repositories

The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computat...

Bibliographic Details
Main authors: Fondrie, William E; Bittremieux, Wout; Noble, William S
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457024/
https://www.ncbi.nlm.nih.gov/pubmed/34342226
http://dx.doi.org/10.1021/acs.jproteome.1c00454
author Fondrie, William E
Bittremieux, Wout
Noble, William S
collection PubMed
description The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computational tools. Here, we present ppx, a Python package that provides easy, programmatic access to the data stored in ProteomeXchange repositories, such as PRIDE and MassIVE. The ppx package can either be used as a command line tool or a Python package to retrieve the files and metadata associated with a project when provided its identifier. To demonstrate how ppx enhances reproducible research, we used ppx within a Snakemake workflow to reanalyze a published dataset with the open modification search tool ANN-SoLo and compared our reanalysis to the original results. We show that ppx readily integrates into workflows and our reanalysis produced results consistent with the original analysis. We envision that ppx will be a valuable tool for creating reproducible analyses, providing tool developers easy access to data for development, testing, and benchmarking, and enabling the use of mass spectrometry data in data-intensive analyses. The ppx package is freely available and open source under the MIT license at: https://github.com/wfondrie/ppx
format Online
Article
Text
id pubmed-8457024
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8457024 2022-09-03 ppx: Programmatic access to proteomics data repositories Fondrie, William E Bittremieux, Wout Noble, William S J Proteome Res Article The volume of proteomics and mass spectrometry data available in public repositories continues to grow at a rapid pace as more researchers embrace open science practices. Open access to the data behind scientific discoveries has become critical to validate published findings and develop new computational tools. Here, we present ppx, a Python package that provides easy, programmatic access to the data stored in ProteomeXchange repositories, such as PRIDE and MassIVE. The ppx package can either be used as a command line tool or a Python package to retrieve the files and metadata associated with a project when provided its identifier. To demonstrate how ppx enhances reproducible research, we used ppx within a Snakemake workflow to reanalyze a published dataset with the open modification search tool ANN-SoLo and compared our reanalysis to the original results. We show that ppx readily integrates into workflows and our reanalysis produced results consistent with the original analysis. We envision that ppx will be a valuable tool for creating reproducible analyses, providing tool developers easy access to data for development, testing, and benchmarking, and enabling the use of mass spectrometry data in data-intensive analyses. The ppx package is freely available and open source under the MIT license at: https://github.com/wfondrie/ppx 2021-08-03 2021-09-03 /pmc/articles/PMC8457024/ /pubmed/34342226 http://dx.doi.org/10.1021/acs.jproteome.1c00454 Text en https://creativecommons.org/licenses/by/4.0/ It is made available under a CC-BY 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title ppx: Programmatic access to proteomics data repositories
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457024/
https://www.ncbi.nlm.nih.gov/pubmed/34342226
http://dx.doi.org/10.1021/acs.jproteome.1c00454
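
The description field above says that ppx retrieves a project's files and metadata "when provided its identifier", and that it covers repositories such as PRIDE and MassIVE. A tool of that kind first has to decide which repository an identifier points to. The sketch below illustrates that one step using only the Python standard library. It is not ppx's own code: the function name `guess_repository` is hypothetical, and the identifier shapes (`PXD` plus six digits for PRIDE, `MSV` plus nine digits for MassIVE) are assumptions that are not stated in this record.

```python
import re

# Assumed identifier shapes (not taken from the record above):
#   PRIDE:   "PXD" followed by six digits
#   MassIVE: "MSV" followed by nine digits
_PATTERNS = {
    "PRIDE": re.compile(r"PXD\d{6}"),
    "MassIVE": re.compile(r"MSV\d{9}"),
}


def guess_repository(identifier: str) -> str:
    """Return the repository a project identifier appears to belong to.

    Hypothetical helper for illustration only; it performs no network access.
    """
    cleaned = identifier.strip().upper()
    for repo, pattern in _PATTERNS.items():
        # fullmatch rejects identifiers with extra leading or trailing characters
        if pattern.fullmatch(cleaned):
            return repo
    raise ValueError(f"Unrecognized project identifier: {identifier!r}")


if __name__ == "__main__":
    # Strings chosen only to match the assumed formats
    for example in ("PXD000001", "MSV000087408"):
        print(example, "->", guess_repository(example))
```

Rejecting malformed identifiers before any download is attempted keeps failures early and explicit, which suits the workflow-driven, reproducible use described in the abstract.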