A data flow process for confidential data and its application in a health research project
BACKGROUND: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection ri...
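Main Authors: | Crossfield, Samantha S. R.; Zucker, Kieran; Baxter, Paul; Wright, Penny; Fistein, Jon; Markham, Alex F.; Birkin, Mark; Glaser, Adam W.; Hall, Geoff |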
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8782367/ https://www.ncbi.nlm.nih.gov/pubmed/35061834 http://dx.doi.org/10.1371/journal.pone.0262609 |
_version_ | 1784638299044839424 |
---|---|
author | Crossfield, Samantha S. R.; Zucker, Kieran; Baxter, Paul; Wright, Penny; Fistein, Jon; Markham, Alex F.; Birkin, Mark; Glaser, Adam W.; Hall, Geoff |
author_sort | Crossfield, Samantha S. R. |
collection | PubMed |
description | BACKGROUND: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to define such an approach with clear steps for dataset creation, and to describe its utilisation in a case study linking healthcare data. METHODS: We developed a data flow protocol that generates pseudonymous datasets that can be reversibly linked, or irreversibly linked to form an anonymous research dataset. It was designed and implemented by the Comprehensive Patient Records (CPR) study in Leeds, UK. RESULTS: We defined a clear approach that received ethico-legal approval for use in creating an anonymous research dataset. Our approach used individual-level linkage through a mechanism that is not computer-intensive and was rendered irreversible to both data providers and processors. We successfully applied it in the CPR study to hospital and general practice and community electronic health record data from two providers, along with patient reported outcomes, for 365,193 patients. The resultant anonymous research dataset is available via DATA-CAN, the Health Data Research Hub for Cancer in the UK. CONCLUSIONS: Through ethical, legal and academic review, we believe that we contribute a defined approach that represents a framework that exceeds current minimum standards for effective pseudonymisation and anonymisation. This paper describes our methods and provides supporting information to facilitate the use of this approach in research. |
format | Online Article Text |
id | pubmed-8782367 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8782367 2022-01-22 PLoS One Research Article Public Library of Science 2022-01-21 /pmc/articles/PMC8782367/ /pubmed/35061834 http://dx.doi.org/10.1371/journal.pone.0262609 Text en © 2022 Crossfield et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | A data flow process for confidential data and its application in a health research project |
topic | Research Article |