The SAIL databank: linking multiple health and social care datasets
BACKGROUND: Vast amounts of data are collected about patients and service users in the course of health and social care service delivery. Electronic data systems for patient records have the potential to revolutionise service delivery and research. But in order to achieve this, it is essential that...
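Main authors: | Lyons, Ronan A; Jones, Kerina H; John, Gareth; Brooks, Caroline J; Verplancke, Jean-Philippe; Ford, David V; Brown, Ginevra; Leake, Ken |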
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2648953/ https://www.ncbi.nlm.nih.gov/pubmed/19149883 http://dx.doi.org/10.1186/1472-6947-9-3 |
_version_ | 1782164999752384512 |
---|---|
author | Lyons, Ronan A; Jones, Kerina H; John, Gareth; Brooks, Caroline J; Verplancke, Jean-Philippe; Ford, David V; Brown, Ginevra; Leake, Ken |
collection | PubMed |
description | BACKGROUND: Vast amounts of data are collected about patients and service users in the course of health and social care service delivery. Electronic data systems for patient records have the potential to revolutionise service delivery and research. But in order to achieve this, it is essential that the ability to link the data at the individual record level be retained whilst adhering to the principles of information governance. The SAIL (Secure Anonymised Information Linkage) databank has been established using disparate datasets, and over 500 million records from multiple health and social care service providers have been loaded to date, with further growth in progress. METHODS: Having established the infrastructure of the databank, the aim of this work was to develop and implement an accurate matching process to enable the assignment of a unique Anonymous Linking Field (ALF) to person-based records to make the databank ready for record-linkage research studies. An SQL-based matching algorithm (MACRAL, Matching Algorithm for Consistent Results in Anonymised Linkage) was developed for this purpose. Firstly, the suitability of using a valid NHS number as the basis of a unique identifier was assessed using MACRAL. Secondly, MACRAL was applied in turn to match primary care, secondary care and social services datasets to the NHS Administrative Register (NHSAR), to assess the efficacy of this process, and the optimum matching technique. RESULTS: The validation of using the NHS number yielded specificity values > 99.8% and sensitivity values > 94.6% using probabilistic record linkage (PRL) at the 50% threshold, and error rates were < 0.2%. A range of techniques for matching datasets to the NHSAR were applied and the optimum technique resulted in sensitivity values of: 99.9% for a GP dataset from primary care, 99.3% for a PEDW dataset from secondary care and 95.2% for the PARIS database from social care. CONCLUSION: With the infrastructure that has been put in place, the reliable matching process that has been developed enables an ALF to be consistently allocated to records in the databank. The SAIL databank represents a research-ready platform for record-linkage studies. |
format | Text |
id | pubmed-2648953 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2648953 2009-03-03 The SAIL databank: linking multiple health and social care datasets BMC Med Inform Decis Mak Research Article BioMed Central 2009-01-16 /pmc/articles/PMC2648953/ /pubmed/19149883 http://dx.doi.org/10.1186/1472-6947-9-3 Text en Copyright ©2009 Lyons et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The SAIL databank: linking multiple health and social care datasets |
topic | Research Article |