MultiPro: DDA-PASEF and diaPASEF acquired cell line proteomic datasets with deliberate batch effects

Mass spectrometry-based proteomics plays a critical role in current biological and clinical research. Technical issues like data integration, missing value imputation, batch effect correction and the exploration of inter-connections amongst these technical issues, can produce errors but are not well...


Bibliographic Details
Main authors: Wang, He, Lim, Kai Peng, Kong, Weijia, Gao, Huanhuan, Wong, Bertrand Jern Han, Phua, Ser Xian, Guo, Tiannan, Goh, Wilson Wen Bin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects: Data Descriptor
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10693559/
https://www.ncbi.nlm.nih.gov/pubmed/38042886
http://dx.doi.org/10.1038/s41597-023-02779-8
author Wang, He
Lim, Kai Peng
Kong, Weijia
Gao, Huanhuan
Wong, Bertrand Jern Han
Phua, Ser Xian
Guo, Tiannan
Goh, Wilson Wen Bin
collection PubMed
description Mass spectrometry-based proteomics plays a critical role in current biological and clinical research. Technical issues like data integration, missing value imputation, batch effect correction and the exploration of inter-connections amongst these technical issues, can produce errors but are not well studied. Although proteomic technologies have improved significantly in recent years, this alone cannot resolve these issues. What is needed are better algorithms and data processing knowledge. But to obtain these, we need appropriate proteomics datasets for exploration, investigation, and benchmarking. To meet this need, we developed MultiPro (Multi-purpose Proteome Resource), a resource comprising four comprehensive large-scale proteomics datasets with deliberate batch effects using the latest parallel accumulation-serial fragmentation in both Data-Dependent Acquisition (DDA) and Data Independent Acquisition (DIA) modes. Each dataset contains a balanced two-class design based on well-characterized and widely studied cell lines (A549 vs K562 or HCC1806 vs HS578T) with 48 or 36 biological and technical replicates altogether, allowing for investigation of a multitude of technical issues. These datasets allow for investigation of inter-connections between class and batch factors, or to develop approaches to compare and integrate data from DDA and DIA platforms.
format Online
Article
Text
id pubmed-10693559
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10693559 2023-12-04 MultiPro: DDA-PASEF and diaPASEF acquired cell line proteomic datasets with deliberate batch effects. Wang, He; Lim, Kai Peng; Kong, Weijia; Gao, Huanhuan; Wong, Bertrand Jern Han; Phua, Ser Xian; Guo, Tiannan; Goh, Wilson Wen Bin. Sci Data. Data Descriptor.
Nature Publishing Group UK 2023-12-02 /pmc/articles/PMC10693559/ /pubmed/38042886 http://dx.doi.org/10.1038/s41597-023-02779-8 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title MultiPro: DDA-PASEF and diaPASEF acquired cell line proteomic datasets with deliberate batch effects
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10693559/
https://www.ncbi.nlm.nih.gov/pubmed/38042886
http://dx.doi.org/10.1038/s41597-023-02779-8