A secure distributed logistic regression protocol for the detection of rare adverse drug events
BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectivenes...
Main authors: | El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Group, 2013 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3628043/ https://www.ncbi.nlm.nih.gov/pubmed/22871397 http://dx.doi.org/10.1136/amiajnl-2011-000735 |
_version_ | 1782266372021026816 |
---|---|
author | El Emam, Khaled Samet, Saeed Arbuckle, Luk Tamblyn, Robyn Earle, Craig Kantarcioglu, Murat |
author_facet | El Emam, Khaled Samet, Saeed Arbuckle, Luk Tamblyn, Robyn Earle, Craig Kantarcioglu, Murat |
author_sort | El Emam, Khaled |
collection | PubMed |
description | BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. OBJECTIVE: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. METHODS: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. RESULTS: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. CONCLUSION: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. |
format | Online Article Text |
id | pubmed-3628043 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BMJ Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-3628043 2013-12-11 A secure distributed logistic regression protocol for the detection of rare adverse drug events El Emam, Khaled Samet, Saeed Arbuckle, Luk Tamblyn, Robyn Earle, Craig Kantarcioglu, Murat J Am Med Inform Assoc Research and Applications BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. OBJECTIVE: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. METHODS: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. RESULTS: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. CONCLUSION: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. BMJ Group 2013 2012-08-07 /pmc/articles/PMC3628043/ /pubmed/22871397 http://dx.doi.org/10.1136/amiajnl-2011-000735 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/3.0/ and http://creativecommons.org/licenses/by-nc/3.0/legalcode |
spellingShingle | Research and Applications El Emam, Khaled Samet, Saeed Arbuckle, Luk Tamblyn, Robyn Earle, Craig Kantarcioglu, Murat A secure distributed logistic regression protocol for the detection of rare adverse drug events |
title | A secure distributed logistic regression protocol for the detection of rare adverse drug events |
title_sort | secure distributed logistic regression protocol for the detection of rare adverse drug events |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3628043/ https://www.ncbi.nlm.nih.gov/pubmed/22871397 http://dx.doi.org/10.1136/amiajnl-2011-000735 |
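
The abstract above reports that the distributed fit yields the same model parameters as an analysis of the pooled raw data. The following is a minimal sketch of why that can hold, assuming a Newton-Raphson fit in which each site contributes only summed gradient and information-matrix terms to an analysis center. It is not the authors' protocol: it contains no encryption or secret sharing and so offers no privacy guarantee. The names `site_summaries`, `fit`, and `simulate_site`, the simulated data, and the coefficients -1.0 and 0.8 are all hypothetical.

```python
import math
import random


def site_summaries(rows, beta):
    """Run locally at one site: return only summed gradient and
    information-matrix (negative Hessian) terms, never patient rows."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def fit(sites, iterations=25):
    """Run at the analysis center: add up the site summaries and take
    Newton-Raphson steps for an intercept and one covariate."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        totals = [0.0] * 5
        for rows in sites:
            for i, value in enumerate(site_summaries(rows, beta)):
                totals[i] += value
        g0, g1, h00, h01, h11 = totals
        # Solve the 2x2 system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        beta[0] += (h11 * g0 - h01 * g1) / det
        beta[1] += (h00 * g1 - h01 * g0) / det
    return beta


def simulate_site(rng, n):
    """Hypothetical data: one covariate, binary outcome."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * x)))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


rng = random.Random(0)
sites = [simulate_site(rng, 200) for _ in range(3)]

# Pooled analysis: all rows treated as a single site.
pooled = fit([[row for rows in sites for row in rows]])
# Distributed analysis: each site contributes only its five sums.
distributed = fit(sites)
print(pooled, distributed)
```

Because the log-likelihood gradient and information matrix are sums over patients, adding per-site sums reproduces the pooled totals up to floating-point rounding, so the two fits should agree to many decimal places. A secure protocol would additionally need to protect the per-site sums themselves, which this sketch does not attempt.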