Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data
Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to identify an individu...
Main authors: | Islam, Md. Mohaiminul; Mohammed, Noman; Wang, Yang; Hu, Pingzhao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Oncology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9259987/ https://www.ncbi.nlm.nih.gov/pubmed/35814415 http://dx.doi.org/10.3389/fonc.2022.879607 |
_version_ | 1784741912757927936 |
---|---|
author | Islam, Md. Mohaiminul Mohammed, Noman Wang, Yang Hu, Pingzhao |
author_sort | Islam, Md. Mohaiminul |
collection | PubMed |
description | Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to uniquely identify an individual (i.e., a privacy violation). Therefore, raw genomic datasets cannot be published or shared with researchers. The recent success of deep learning (DL) in diverse problems has demonstrated its suitability for analyzing large volumes of high-dimensional genomic data. Still, DL-based models leak information about the training samples. To overcome this challenge, we can incorporate differential privacy mechanisms into the DL analysis framework, as differential privacy can protect individuals’ privacy. We proposed a differential privacy-based DL framework to solve two biological problems: breast cancer status (BCS) and cancer type (CT) classification, and drug sensitivity prediction. To predict BCS and CT using genomic data, we built a differentially private (DP) deep autoencoder (dpAE) using private gene expression datasets that performs low-dimensional data representation learning. We used dpAE features to build multiple DP binary classifiers to predict BCS and CT in any individual. To predict drug sensitivity, we used the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. We extracted GDSC’s dpAE features to build our DP drug sensitivity prediction model for 265 drugs. Evaluation of our proposed DP framework shows that it achieves better performance in predicting BCS, CT, and drug sensitivity than previously published DP work. |
format | Online Article Text |
id | pubmed-9259987 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9259987 2022-07-08 Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data Islam, Md. Mohaiminul Mohammed, Noman Wang, Yang Hu, Pingzhao Front Oncol Oncology Proper analysis of high-dimensional human genomic data is necessary to increase human knowledge about fundamental biological questions such as disease associations and drug sensitivity. However, such data contain sensitive private information about individuals and can be used to uniquely identify an individual (i.e., a privacy violation). Therefore, raw genomic datasets cannot be published or shared with researchers. The recent success of deep learning (DL) in diverse problems has demonstrated its suitability for analyzing large volumes of high-dimensional genomic data. Still, DL-based models leak information about the training samples. To overcome this challenge, we can incorporate differential privacy mechanisms into the DL analysis framework, as differential privacy can protect individuals’ privacy. We proposed a differential privacy-based DL framework to solve two biological problems: breast cancer status (BCS) and cancer type (CT) classification, and drug sensitivity prediction. To predict BCS and CT using genomic data, we built a differentially private (DP) deep autoencoder (dpAE) using private gene expression datasets that performs low-dimensional data representation learning. We used dpAE features to build multiple DP binary classifiers to predict BCS and CT in any individual. To predict drug sensitivity, we used the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. We extracted GDSC’s dpAE features to build our DP drug sensitivity prediction model for 265 drugs. Evaluation of our proposed DP framework shows that it achieves better performance in predicting BCS, CT, and drug sensitivity than previously published DP work. Frontiers Media S.A. 2022-06-23 /pmc/articles/PMC9259987/ /pubmed/35814415 http://dx.doi.org/10.3389/fonc.2022.879607 Text en Copyright © 2022 Islam, Mohammed, Wang and Hu https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Differential Private Deep Learning Models for Analyzing Breast Cancer Omics Data |
topic | Oncology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9259987/ https://www.ncbi.nlm.nih.gov/pubmed/35814415 http://dx.doi.org/10.3389/fonc.2022.879607 |
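
The description above says that differential privacy mechanisms can be incorporated into deep learning, but this record does not state which mechanism the authors used. As a minimal sketch only, the block below assumes one widely used approach, a DP-SGD-style gradient step: each per-example gradient is clipped to a fixed L2 norm, and Gaussian noise scaled to that norm is added before averaging. The function names `clip_gradient` and `dp_average_gradient` and the parameters `clip_norm` and `noise_multiplier` are hypothetical and do not come from the article.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each gradient, sum, add Gaussian noise, then average.

    Assumption: noise standard deviation = noise_multiplier * clip_norm,
    as in a DP-SGD-style update. This is an illustration, not the
    article's implementation, and it does no privacy accounting.
    """
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    sigma = noise_multiplier * clip_norm
    noisy_sum = [
        sum(g[j] for g in clipped) + rng.gauss(0.0, sigma)
        for j in range(dim)
    ]
    return [s / len(clipped) for s in noisy_sum]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-example gradients
    print(dp_average_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single individual's record can change the update, which is what lets the added noise mask that individual's contribution. A real training run would also need a privacy accountant to track the cumulative privacy budget, which this sketch omits.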