Privacy-preserving cancer type prediction with homomorphic encryption

Cancer genomics tailors diagnosis and treatment to an individual’s genetic information and is the crux of precision medicine. However, analyzing and maintaining high volumes of genetic mutation data to build a machine learning (ML) model that predicts cancer type is a computationally expensi...

Bibliographic Details
Main authors: Sarkar, Esha, Chielle, Eduardo, Gursoy, Gamze, Chen, Leo, Gerstein, Mark, Maniatakos, Michail
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9886900/
https://www.ncbi.nlm.nih.gov/pubmed/36717667
http://dx.doi.org/10.1038/s41598-023-28481-8
_version_ 1784880220942106624
author Sarkar, Esha
Chielle, Eduardo
Gursoy, Gamze
Chen, Leo
Gerstein, Mark
Maniatakos, Michail
author_sort Sarkar, Esha
collection PubMed
description Cancer genomics tailors diagnosis and treatment to an individual’s genetic information and is the crux of precision medicine. However, analyzing and maintaining high volumes of genetic mutation data to build a machine learning (ML) model that predicts cancer type is computationally expensive and is often outsourced to powerful cloud servers, raising critical privacy concerns for patients’ data. Homomorphic encryption (HE) enables computation on encrypted data, thus providing cryptographic guarantees to protect privacy, but the restrictive overheads of encrypted computation deter its use. In this work, we explore the challenges of privacy-preserving cancer type prediction using a dataset of more than 2 million genetic mutations from 2713 patients across several cancer types, by building a highly accurate ML model and then implementing its privacy-preserving version in HE. Our solution for cancer type inference encodes somatic mutations into the feature space based on their impact on the cancer genomes and then uses statistical tests for feature selection. We propose a fast matrix multiplication algorithm for HE-based models. Our final model achieves a micro-average area under the curve of 0.98, improving accuracy from 70.08% to 83.61%, and is 550 times faster than standard matrix multiplication-based privacy-preserving models. Our tool can be found at https://github.com/momalab/octal-candet.
format Online
Article
Text
id pubmed-9886900
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9886900 2023-02-01 Privacy-preserving cancer type prediction with homomorphic encryption Sarkar, Esha Chielle, Eduardo Gursoy, Gamze Chen, Leo Gerstein, Mark Maniatakos, Michail Sci Rep Article
Nature Publishing Group UK 2023-01-30 /pmc/articles/PMC9886900/ /pubmed/36717667 http://dx.doi.org/10.1038/s41598-023-28481-8 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Privacy-preserving cancer type prediction with homomorphic encryption
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9886900/
https://www.ncbi.nlm.nih.gov/pubmed/36717667
http://dx.doi.org/10.1038/s41598-023-28481-8
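The abstract's statement that homomorphic encryption "enables computation on encrypted data" can be illustrated with a minimal sketch. This record does not say which HE scheme or library the authors use, so the example below swaps in a toy Paillier cryptosystem (additively homomorphic only) with deliberately tiny, insecure primes. The function names (`keygen`, `encrypt`, `decrypt`, `encrypted_dot`) and the feature and weight values are hypothetical and are not taken from the paper or its tool. The sketch shows a server computing one row of a matrix-vector product, a weighted sum of encrypted features with plaintext integer weights, without ever decrypting the features.

```python
import math
import random

# Toy Paillier cryptosystem. ASSUMPTION: this is an illustrative stand-in,
# not the scheme used in the paper. The primes are far too small to be secure.

def keygen(p=1789, q=1867):
    """Return (public_key, private_key) built from two small primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1               # standard simple generator choice
    mu = pow(lam, -1, n)    # inverse of lam mod n (valid because g = n + 1)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer m with 0 <= m < n."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext integer from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def encrypted_dot(pub, enc_features, weights):
    """Compute Enc(sum(w_i * x_i)) from Enc(x_i) and plaintext weights w_i >= 0.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to
    the power w multiplies its plaintext by w.
    """
    n, _ = pub
    n2 = n * n
    acc = encrypt(pub, 0)
    for c, w in zip(enc_features, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc

if __name__ == "__main__":
    pub, priv = keygen()
    features = [1, 0, 3]   # hypothetical encoded mutation features (client side)
    weights = [2, 5, 4]    # hypothetical integer model weights (server side)
    enc = [encrypt(pub, x) for x in features]
    enc_score = encrypted_dot(pub, enc, weights)  # server never sees features
    print(decrypt(pub, priv, enc_score))          # 14
```

Only the holder of the private key can read the score. Schemes that also support ciphertext-by-ciphertext multiplication are needed for richer models, and the cost of such operations is the kind of overhead the abstract refers to.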