Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data

Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to identify an individu...

Bibliographic Details
Main Authors: Islam, Md. Mohaiminul; Mohammed, Noman; Wang, Yang; Hu, Pingzhao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9259987/
https://www.ncbi.nlm.nih.gov/pubmed/35814415
http://dx.doi.org/10.3389/fonc.2022.879607
author Islam, Md. Mohaiminul
Mohammed, Noman
Wang, Yang
Hu, Pingzhao
collection PubMed
description Proper analysis of high-dimensional human genomic data is necessary to advance knowledge of fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information and can be used to uniquely identify an individual (i.e., a privacy violation). Therefore, raw genomic datasets cannot be published openly or shared with researchers. The recent success of deep learning (DL) in diverse problems has shown its suitability for analyzing large volumes of high-dimensional genomic data. Still, DL-based models can leak information about their training samples. To overcome this challenge, differential privacy mechanisms can be incorporated into the DL analysis framework to protect individuals’ privacy. We proposed a differential-privacy-based DL framework to solve two biological problems: breast cancer status (BCS) and cancer type (CT) classification, and drug sensitivity prediction. To predict BCS and CT from genomic data, we built a differentially private (DP) deep autoencoder (dpAE) on private gene expression datasets to learn a low-dimensional data representation. We used the dpAE features to build multiple DP binary classifiers that predict BCS and CT for any individual. To predict drug sensitivity, we used the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. We extracted dpAE features from GDSC to build our DP drug sensitivity prediction model for 265 drugs. Evaluation of our proposed DP framework shows that it predicts BCS, CT, and drug sensitivity better than previously published DP work.
format Online
Article
Text
id pubmed-9259987
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9259987 2022-07-08 Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data Islam, Md. Mohaiminul Mohammed, Noman Wang, Yang Hu, Pingzhao Front Oncol Oncology Frontiers Media S.A. 2022-06-23 /pmc/articles/PMC9259987/ /pubmed/35814415 http://dx.doi.org/10.3389/fonc.2022.879607 Text en Copyright © 2022 Islam, Mohammed, Wang and Hu https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
topic Oncology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9259987/
https://www.ncbi.nlm.nih.gov/pubmed/35814415
http://dx.doi.org/10.3389/fonc.2022.879607