
A secure distributed logistic regression protocol for the detection of rare adverse drug events

BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectivenes...

Bibliographic Details
Main authors: El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
Format: Online Article Text
Language: English
Published: BMJ Group 2013
Subjects: Research and Applications
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3628043/
https://www.ncbi.nlm.nih.gov/pubmed/22871397
http://dx.doi.org/10.1136/amiajnl-2011-000735
_version_ 1782266372021026816
author El Emam, Khaled
Samet, Saeed
Arbuckle, Luk
Tamblyn, Robyn
Earle, Craig
Kantarcioglu, Murat
collection PubMed
description BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. OBJECTIVE: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. METHODS: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. RESULTS: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. CONCLUSION: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models.
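The abstract reports that the distributed fit yields the same model parameters as an analysis of the pooled raw data. The sketch below illustrates only that arithmetic point and is not the authors' protocol: it assumes a Newton-Raphson fit with one intercept and one covariate, and it exchanges each site's summary sums in the clear, with none of the cryptographic protection the paper is about. The function names (`site_summaries`, `fit`, `simulate_site`) and the simulated data are hypothetical.

```python
import math
import random


def site_summaries(rows, beta):
    """Return one site's score and information sums at beta.

    rows is a list of (x, y) pairs; beta is [intercept, slope].
    Only these five aggregate numbers leave the site, not the rows.
    NOTE: in this sketch they are sent unprotected (an assumption).
    """
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        g0 += y - p            # score, intercept component
        g1 += (y - p) * x      # score, slope component
        h00 += w               # information matrix entries (X'WX)
        h01 += w * x
        h11 += w * x * x
    return [g0, g1, h00, h01, h11]


def fit(sites, iterations=25):
    """Newton-Raphson logistic regression from per-site summaries."""
    beta = [0.0, 0.0]
    for _ in range(iterations):
        total = [0.0] * 5
        for rows in sites:
            for i, value in enumerate(site_summaries(rows, beta)):
                total[i] += value
        g0, g1, h00, h01, h11 = total
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * score (2x2 case)
        beta[0] += (h11 * g0 - h01 * g1) / det
        beta[1] += (h00 * g1 - h01 * g0) / det
    return beta


def simulate_site(rng, n):
    """Simulated (x, y) rows; the true coefficients are arbitrary."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * x)))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


rng = random.Random(0)
sites = [simulate_site(rng, 200) for _ in range(3)]
pooled_rows = [row for rows in sites for row in rows]

distributed = fit(sites)       # three sites, summaries added centrally
pooled = fit([pooled_rows])    # all rows analyzed in one place
print(distributed, pooled)
```

Because the score and information matrix are sums over patients, adding the per-site sums gives the same Newton step as the pooled data, so the two fits agree up to floating-point rounding.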
format Online
Article
Text
id pubmed-3628043
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BMJ Group
record_format MEDLINE/PubMed
spelling pubmed-3628043 2013-12-11 A secure distributed logistic regression protocol for the detection of rare adverse drug events El Emam, Khaled Samet, Saeed Arbuckle, Luk Tamblyn, Robyn Earle, Craig Kantarcioglu, Murat J Am Med Inform Assoc Research and Applications BMJ Group 2013 2012-08-07 /pmc/articles/PMC3628043/ /pubmed/22871397 http://dx.doi.org/10.1136/amiajnl-2011-000735 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/3.0/ and http://creativecommons.org/licenses/by-nc/3.0/legalcode
title A secure distributed logistic regression protocol for the detection of rare adverse drug events
topic Research and Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3628043/
https://www.ncbi.nlm.nih.gov/pubmed/22871397
http://dx.doi.org/10.1136/amiajnl-2011-000735