Simplified and Unified Access to Cancer Proteogenomic Data

Comprehensive cancer data sets recently generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) offer great potential for advancing our understanding of how to combat cancer. These data sets include DNA, RNA, protein, and clinical characterization for tumor and normal...

Bibliographic Details
Main authors: Lindgren, Caleb M., Adams, David W., Kimball, Benjamin, Boekweg, Hannah, Tayler, Sadie, Pugh, Samuel L., Payne, Samuel H.
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022323/
https://www.ncbi.nlm.nih.gov/pubmed/33560848
http://dx.doi.org/10.1021/acs.jproteome.0c00919
author Lindgren, Caleb M.
Adams, David W.
Kimball, Benjamin
Boekweg, Hannah
Tayler, Sadie
Pugh, Samuel L.
Payne, Samuel H.
collection PubMed
description Comprehensive cancer data sets recently generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) offer great potential for advancing our understanding of how to combat cancer. These data sets include DNA, RNA, protein, and clinical characterization for tumor and normal samples from large cohorts of many different cancer types. The raw data are publicly available at various Cancer Research Data Commons. However, widespread reuse of these data sets is also facilitated by easy access to the processed quantitative data tables. We have created a data application programming interface (API) to distribute these processed tables, implemented as a Python package called cptac. We implement it such that users who prefer to work in R can easily use our package for data access and then transfer the data into R for analysis. Our package distributes the finalized processed CPTAC data sets in a consistent, up-to-date format. This consistency makes it easy to integrate the data with common graphing, statistical, and machine-learning packages for advanced analysis. Additionally, consistent formatting across all cancer types promotes the investigation of pan-cancer trends. The data API structure of directly streaming data within a programming environment enhances the reproducibility. Finally, with the accompanying tutorials, this package provides a novel resource for cancer research education. View the software documentation at https://paynelab.github.io/cptac/. View the GitHub repository at https://github.com/PayneLab/cptac.
format Online
Article
Text
id pubmed-8022323
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8022323 2021-04-06 Simplified and Unified Access to Cancer Proteogenomic Data Lindgren, Caleb M. Adams, David W. Kimball, Benjamin Boekweg, Hannah Tayler, Sadie Pugh, Samuel L. Payne, Samuel H. J Proteome Res American Chemical Society 2021-02-09 2021-04-02 /pmc/articles/PMC8022323/ /pubmed/33560848 http://dx.doi.org/10.1021/acs.jproteome.0c00919 Text en © 2021 The Authors. Published by American Chemical Society. Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Simplified and Unified Access to Cancer Proteogenomic Data