Privacy preserving validation for multiomic prediction models
Reproducibility of results obtained using ribonucleic acid (RNA) data across labs remains a major hurdle in cancer research. Often, molecular predictors trained on one dataset cannot be applied to another due to differences in RNA library preparation and quantification, which inhibits the validation...
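Main authors: | Ahmed, Talal; Carty, Mark A; Wenric, Stephane; Dry, Jonathan R; Salahudeen, Ameen A; Khan, Aly A; Lefkofsky, Eric; Stumpe, Martin C; Pelossof, Raphael |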
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116386/ https://www.ncbi.nlm.nih.gov/pubmed/35388408 http://dx.doi.org/10.1093/bib/bbac110 |
_version_ | 1784710102445457408 |
---|---|
author | Ahmed, Talal Carty, Mark A Wenric, Stephane Dry, Jonathan R Salahudeen, Ameen A Khan, Aly A Lefkofsky, Eric Stumpe, Martin C Pelossof, Raphael |
author_sort | Ahmed, Talal |
collection | PubMed |
description | Reproducibility of results obtained using ribonucleic acid (RNA) data across labs remains a major hurdle in cancer research. Often, molecular predictors trained on one dataset cannot be applied to another due to differences in RNA library preparation and quantification, which inhibits the validation of predictors across labs. While current RNA correction algorithms reduce these differences, they require simultaneous access to patient-level data from all datasets, which necessitates the sharing of training data for predictors when sharing predictors. Here, we describe SpinAdapt, an unsupervised RNA correction algorithm that enables the transfer of molecular models without requiring access to patient-level data. It computes data corrections only via aggregate statistics of each dataset, thereby maintaining patient data privacy. Despite an inherent trade-off between privacy and performance, SpinAdapt outperforms current correction methods, like Seurat and ComBat, on publicly available cancer studies, including TCGA and ICGC. Furthermore, SpinAdapt can correct new samples, thereby enabling unbiased evaluation on validation cohorts. We expect this novel correction paradigm to enhance research reproducibility and to preserve patient privacy. |
format | Online Article Text |
id | pubmed-9116386 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9116386 2022-05-19 Privacy preserving validation for multiomic prediction models Ahmed, Talal Carty, Mark A Wenric, Stephane Dry, Jonathan R Salahudeen, Ameen A Khan, Aly A Lefkofsky, Eric Stumpe, Martin C Pelossof, Raphael Brief Bioinform Problem Solving Protocol Oxford University Press 2022-04-06 /pmc/articles/PMC9116386/ /pubmed/35388408 http://dx.doi.org/10.1093/bib/bbac110 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Privacy preserving validation for multiomic prediction models |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9116386/ https://www.ncbi.nlm.nih.gov/pubmed/35388408 http://dx.doi.org/10.1093/bib/bbac110 |
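The description in this record says the correction is computed "only via aggregate statistics of each dataset". The sketch below illustrates that general idea with a much simpler technique, per-gene mean and standard deviation matching. It is not SpinAdapt and does not reproduce the published algorithm; the function names `summarize` and `correct`, the synthetic data, and the `eps` parameter are all assumptions made for illustration.

```python
import numpy as np

def summarize(X):
    """Return per-gene aggregate statistics for a (samples x genes) matrix."""
    return {"mean": X.mean(axis=0), "std": X.std(axis=0)}

def correct(X_target, target_stats, source_stats, eps=1e-8):
    """Map target expression onto the source scale using aggregates only.

    Hypothetical helper: standardizes each gene with the target statistics,
    then rescales with the source statistics. No source samples are needed.
    """
    z = (X_target - target_stats["mean"]) / (target_stats["std"] + eps)
    return z * source_stats["std"] + source_stats["mean"]

# Synthetic stand-ins for two labs measuring the same three genes.
rng = np.random.default_rng(0)
source = rng.normal(5.0, 2.0, size=(200, 3))
target = rng.normal(8.0, 0.5, size=(150, 3))

source_stats = summarize(source)  # the only thing the source lab shares
target_stats = summarize(target)
corrected = correct(target, target_stats, source_stats)

# A sample that arrives later can be corrected with the same statistics.
new_sample = rng.normal(8.0, 0.5, size=(1, 3))
new_corrected = correct(new_sample, target_stats, source_stats)
```

In this toy setting only `source_stats` crosses the lab boundary, which mirrors the privacy argument in the description; whether simple moment matching is adequate for real RNA data is a separate question that the sketch does not address.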