
A data flow process for confidential data and its application in a health research project

BACKGROUND: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection ri...


Bibliographic Details
Main Authors: Crossfield, Samantha S. R., Zucker, Kieran, Baxter, Paul, Wright, Penny, Fistein, Jon, Markham, Alex F., Birkin, Mark, Glaser, Adam W., Hall, Geoff
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8782367/
https://www.ncbi.nlm.nih.gov/pubmed/35061834
http://dx.doi.org/10.1371/journal.pone.0262609
author Crossfield, Samantha S. R.
Zucker, Kieran
Baxter, Paul
Wright, Penny
Fistein, Jon
Markham, Alex F.
Birkin, Mark
Glaser, Adam W.
Hall, Geoff
collection PubMed
description BACKGROUND: The use of linked healthcare data in research has the potential to make major contributions to knowledge generation and service improvement. However, using healthcare data for secondary purposes raises legal and ethical concerns relating to confidentiality, privacy and data protection rights. Using a linkage and anonymisation approach that processes data lawfully and in line with ethical best practice to create an anonymous (non-personal) dataset can address these concerns, yet there is no set approach for defining all of the steps involved in such data flow end-to-end. We aimed to define such an approach with clear steps for dataset creation, and to describe its utilisation in a case study linking healthcare data. METHODS: We developed a data flow protocol that generates pseudonymous datasets that can be reversibly linked, or irreversibly linked to form an anonymous research dataset. It was designed and implemented by the Comprehensive Patient Records (CPR) study in Leeds, UK. RESULTS: We defined a clear approach that received ethico-legal approval for use in creating an anonymous research dataset. Our approach used individual-level linkage through a mechanism that is not computer-intensive and was rendered irreversible to both data providers and processors. We successfully applied it in the CPR study to hospital and general practice and community electronic health record data from two providers, along with patient reported outcomes, for 365,193 patients. The resultant anonymous research dataset is available via DATA-CAN, the Health Data Research Hub for Cancer in the UK. CONCLUSIONS: Through ethical, legal and academic review, we believe that we contribute a defined approach that represents a framework that exceeds current minimum standards for effective pseudonymisation and anonymisation. This paper describes our methods and provides supporting information to facilitate the use of this approach in research.
format Online
Article
Text
id pubmed-8782367
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8782367 2022-01-22 A data flow process for confidential data and its application in a health research project Crossfield, Samantha S. R. Zucker, Kieran Baxter, Paul Wright, Penny Fistein, Jon Markham, Alex F. Birkin, Mark Glaser, Adam W. Hall, Geoff PLoS One Research Article Public Library of Science 2022-01-21 /pmc/articles/PMC8782367/ /pubmed/35061834 http://dx.doi.org/10.1371/journal.pone.0262609 Text en © 2022 Crossfield et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A data flow process for confidential data and its application in a health research project
topic Research Article