Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial
Advancements in mass spectrometry‐based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much‐needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step‐by‐step protocol for the a...
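Main authors: | Čuklina, Jelena; Lee, Chloe H; Williams, Evan G; Sajic, Tatjana; Collins, Ben C; Rodríguez Martínez, María; Sharma, Varun S; Wendt, Fabian; Goetze, Sandra; Keele, Gregory R; Wollscheid, Bernd; Aebersold, Ruedi; Pedrioli, Patrick G A |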
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | Reviews |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8447595/ https://www.ncbi.nlm.nih.gov/pubmed/34432947 http://dx.doi.org/10.15252/msb.202110240 |
_version_ | 1784569049745719296 |
---|---|
author | Čuklina, Jelena Lee, Chloe H Williams, Evan G Sajic, Tatjana Collins, Ben C Rodríguez Martínez, María Sharma, Varun S Wendt, Fabian Goetze, Sandra Keele, Gregory R Wollscheid, Bernd Aebersold, Ruedi Pedrioli, Patrick G A |
author_facet | Čuklina, Jelena Lee, Chloe H Williams, Evan G Sajic, Tatjana Collins, Ben C Rodríguez Martínez, María Sharma, Varun S Wendt, Fabian Goetze, Sandra Keele, Gregory R Wollscheid, Bernd Aebersold, Ruedi Pedrioli, Patrick G A |
author_sort | Čuklina, Jelena |
collection | PubMed |
description | Advancements in mass spectrometry‐based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much‐needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step‐by‐step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, "proBatch", containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology. |
format | Online Article Text |
id | pubmed-8447595 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-84475952021-10-06 Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial Čuklina, Jelena Lee, Chloe H Williams, Evan G Sajic, Tatjana Collins, Ben C Rodríguez Martínez, María Sharma, Varun S Wendt, Fabian Goetze, Sandra Keele, Gregory R Wollscheid, Bernd Aebersold, Ruedi Pedrioli, Patrick G A Mol Syst Biol Reviews Advancements in mass spectrometry‐based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much‐needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step‐by‐step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, "proBatch", containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology. John Wiley and Sons Inc. 2021-08-25 /pmc/articles/PMC8447595/ /pubmed/34432947 http://dx.doi.org/10.15252/msb.202110240 Text en © 2021 The Authors. 
Published under the terms of the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/). This is an open access article under the terms of the CC BY 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Reviews Čuklina, Jelena Lee, Chloe H Williams, Evan G Sajic, Tatjana Collins, Ben C Rodríguez Martínez, María Sharma, Varun S Wendt, Fabian Goetze, Sandra Keele, Gregory R Wollscheid, Bernd Aebersold, Ruedi Pedrioli, Patrick G A Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title | Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title_full | Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title_fullStr | Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title_full_unstemmed | Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title_short | Diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
title_sort | diagnostics and correction of batch effects in large‐scale proteomic studies: a tutorial |
topic | Reviews |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8447595/ https://www.ncbi.nlm.nih.gov/pubmed/34432947 http://dx.doi.org/10.15252/msb.202110240 |