Representation transfer for differentially private drug sensitivity prediction

MOTIVATION: Human genomic datasets often contain sensitive information that limits the use and sharing of the data. In particular, simple anonymization strategies fail to provide a sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine...

Bibliographic Details
Main authors:	Niinimäki, Teppo; Heikkilä, Mikko A; Honkela, Antti; Kaski, Samuel
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	ISMB/ECCB 2019 Conference Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612875/ https://www.ncbi.nlm.nih.gov/pubmed/31510659 http://dx.doi.org/10.1093/bioinformatics/btz373

author	Niinimäki, Teppo; Heikkilä, Mikko A; Honkela, Antti; Kaski, Samuel
collection	PubMed
description	MOTIVATION: Human genomic datasets often contain sensitive information that limits the use and sharing of the data. In particular, simple anonymization strategies fail to provide a sufficient level of protection for genomic data, because the data are inherently identifiable. Differentially private machine learning can help by guaranteeing that the published results do not leak too much information about any individual data point. Recent research has reached promising results on differentially private drug sensitivity prediction using gene expression data. Differentially private learning with genomic data is challenging because it is more difficult to guarantee privacy in high dimensions. Dimensionality reduction can help, but if the dimension reduction mapping is learned from the data, then it needs to be differentially private too, which can carry a significant privacy cost. Furthermore, the selection of any hyperparameters (such as the target dimensionality) must also avoid leaking private information. RESULTS: We study an approach that uses a large public dataset of a similar type to learn a compact representation for differentially private learning. We compare three representation learning methods: variational autoencoders, principal component analysis and random projection. We solve two machine learning tasks on gene expression of cancer cell lines: cancer type classification and drug sensitivity prediction. The experiments demonstrate a significant benefit from all representation learning methods, with variational autoencoders providing the most accurate predictions most often. Our results significantly improve on the previous state of the art in the accuracy of differentially private drug sensitivity prediction. AVAILABILITY AND IMPLEMENTATION: Code used in the experiments is available at https://github.com/DPBayes/dp-representation-transfer.
format	Online Article Text
id	pubmed-6612875
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6612875 2019-07-12 Representation transfer for differentially private drug sensitivity prediction. Niinimäki, Teppo; Heikkilä, Mikko A; Honkela, Antti; Kaski, Samuel. Bioinformatics. ISMB/ECCB 2019 Conference Proceedings. Oxford University Press 2019-07 2019-07-05 /pmc/articles/PMC6612875/ /pubmed/31510659 http://dx.doi.org/10.1093/bioinformatics/btz373 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Representation transfer for differentially private drug sensitivity prediction
topic	ISMB/ECCB 2019 Conference Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612875/ https://www.ncbi.nlm.nih.gov/pubmed/31510659 http://dx.doi.org/10.1093/bioinformatics/btz373
work_keys_str_mv	AT niinimakiteppo representationtransferfordifferentiallyprivatedrugsensitivityprediction AT heikkilamikkoa representationtransferfordifferentiallyprivatedrugsensitivityprediction AT honkelaantti representationtransferfordifferentiallyprivatedrugsensitivityprediction AT kaskisamuel representationtransferfordifferentiallyprivatedrugsensitivityprediction
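
The approach summarized in the description (learn a compact representation from public data, then run differentially private learning on the projected private data) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository linked above. It assumes principal component analysis as the representation, and substitutes a simple ridge regression on Gaussian-perturbed sufficient statistics for the private learner. All function names (`fit_public_pca`, `clip_rows`, `dp_linear_regression`), the clipping bounds, the add/remove-one-sample neighbourhood definition and the synthetic data are assumptions made for illustration. The target dimensionality `k` is fixed in advance rather than tuned on the private data, since tuning it privately would consume privacy budget.

```python
import numpy as np


def fit_public_pca(public_x, k):
    """Learn a k-dimensional linear projection from PUBLIC data only."""
    mean = public_x.mean(axis=0)
    # Rows of vt are the principal directions of the centred public data.
    _, _, vt = np.linalg.svd(public_x - mean, full_matrices=False)
    return mean, vt[:k].T  # components has shape (n_features, k)


def clip_rows(z, bound=1.0):
    """Scale each row so that its L2 norm is at most `bound`."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z * np.minimum(1.0, bound / np.maximum(norms, 1e-12))


def dp_linear_regression(z, y, epsilon, delta, rng, ridge=1.0):
    """Ridge regression on Gaussian-perturbed sufficient statistics (illustrative)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("the classical Gaussian mechanism needs 0 < epsilon < 1")
    z = clip_rows(z)            # enforce ||z_i||_2 <= 1
    y = np.clip(y, -1.0, 1.0)   # enforce |y_i| <= 1
    k = z.shape[1]
    # Assumption: neighbouring datasets differ by adding or removing one sample,
    # which changes (Z^T Z, Z^T y) by at most sqrt(2) in joint L2 norm.
    sigma = np.sqrt(2.0) * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    zz = z.T @ z + rng.normal(0.0, sigma, size=(k, k))
    zy = z.T @ y + rng.normal(0.0, sigma, size=k)
    zz = (zz + zz.T) / 2.0      # symmetrising is post-processing, no privacy cost
    # The noise can make zz indefinite; the ridge term only partly guards against that.
    return np.linalg.solve(zz + ridge * np.eye(k), zy)


# Synthetic stand-ins for a public and a private gene expression matrix.
rng = np.random.default_rng(0)
public_x = rng.normal(size=(500, 50))
private_x = rng.normal(size=(200, 50))
private_y = rng.normal(size=200)

# The projection is learned without touching the private data.
mean, components = fit_public_pca(public_x, k=5)
z_private = (private_x - mean) @ components
weights = dp_linear_regression(z_private, private_y, epsilon=0.9, delta=1e-5, rng=rng)
```

Because the mean and the projection come only from public data, they consume no privacy budget in this sketch; the entire budget is spent on the low-dimensional statistics, whose noise grows with the number of perturbed entries. That is the motivation for reducing dimensionality before private learning.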
