
Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review

INTRODUCTION: The challenges in identifying a cohort of people with a rare condition can be addressed by routinely collected, population-scale electronic health record (EHR) data, which provide large volumes of data at a national level. This paper describes the challenges of accurately identifying a...


Bibliographic Details
Main Authors: Griffiths, R, Schlüter, DK, Akbari, A, Cosgriff, R, Tucker, D, Taylor-Robinson, D
Format: Online Article Text
Language: English
Published: Swansea University 2020
Subjects: Population Data Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7898022/
https://www.ncbi.nlm.nih.gov/pubmed/33644411
http://dx.doi.org/10.23889/ijpds.v5i1.1346
_version_ 1783653786432045056
author Griffiths, R
Schlüter, DK
Akbari, A
Cosgriff, R
Tucker, D
Taylor-Robinson, D
author_facet Griffiths, R
Schlüter, DK
Akbari, A
Cosgriff, R
Tucker, D
Taylor-Robinson, D
author_sort Griffiths, R
collection PubMed
description INTRODUCTION: The challenges in identifying a cohort of people with a rare condition can be addressed by routinely collected, population-scale electronic health record (EHR) data, which provide large volumes of data at a national level. This paper describes the challenges of accurately identifying a cohort of children with Cystic Fibrosis (CF) using EHR and their validation against the UK CF Registry. OBJECTIVES: To establish a proof of principle and provide insight into the merits of linked data in CF research; to identify the benefits of access to multiple data sources, in particular the UK CF Registry data, and to demonstrate the opportunity it represents as a resource for future CF research. METHODS: Three EHR data sources were used to identify children with CF born in Wales between 1st January 1998 and 31st August 2015 within the Secure Anonymised Information Linkage (SAIL) Databank. The UK CF Registry was later acquired by SAIL and linked to the EHR cohort to validate the cases and explore the reasons for misclassifications. RESULTS: We identified 352 children with CF in the three EHR data sources. This was greater than expected based on historical incidence rates in Wales. Subsequent validation using the UK CF Registry found that 257 (73%) of these were true cases. Approximately 98.7% (156/158) of individuals identified as CF cases in all three EHR data sources were confirmed as true cases; but this was only the case for 19.8% (20/101) of all those identified in just a single data source. CONCLUSION: Identifying health conditions in EHR data can be challenging, so data quality assurance and validation is important or the merit of the research is undermined. This retrospective review identifies some of the challenges in identifying CF cases and demonstrates the benefits of linking cases across multiple data sources to improve quality.
format Online
Article
Text
id pubmed-7898022
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Swansea University
record_format MEDLINE/PubMed
spelling pubmed-7898022 2021-02-26 Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review Griffiths, R Schlüter, DK Akbari, A Cosgriff, R Tucker, D Taylor-Robinson, D Int J Popul Data Sci Population Data Science INTRODUCTION: The challenges in identifying a cohort of people with a rare condition can be addressed by routinely collected, population-scale electronic health record (EHR) data, which provide large volumes of data at a national level. This paper describes the challenges of accurately identifying a cohort of children with Cystic Fibrosis (CF) using EHR and their validation against the UK CF Registry. OBJECTIVES: To establish a proof of principle and provide insight into the merits of linked data in CF research; to identify the benefits of access to multiple data sources, in particular the UK CF Registry data, and to demonstrate the opportunity it represents as a resource for future CF research. METHODS: Three EHR data sources were used to identify children with CF born in Wales between 1st January 1998 and 31st August 2015 within the Secure Anonymised Information Linkage (SAIL) Databank. The UK CF Registry was later acquired by SAIL and linked to the EHR cohort to validate the cases and explore the reasons for misclassifications. RESULTS: We identified 352 children with CF in the three EHR data sources. This was greater than expected based on historical incidence rates in Wales. Subsequent validation using the UK CF Registry found that 257 (73%) of these were true cases. Approximately 98.7% (156/158) of individuals identified as CF cases in all three EHR data sources were confirmed as true cases; but this was only the case for 19.8% (20/101) of all those identified in just a single data source. CONCLUSION: Identifying health conditions in EHR data can be challenging, so data quality assurance and validation is important or the merit of the research is undermined. This retrospective review identifies some of the challenges in identifying CF cases and demonstrates the benefits of linking cases across multiple data sources to improve quality. Swansea University 2020-08-11 /pmc/articles/PMC7898022/ /pubmed/33644411 http://dx.doi.org/10.23889/ijpds.v5i1.1346 Text en https://creativecommons.org/licences/by/4.0/ This work is licenced under a Creative Commons Attribution 4.0 International License.
spellingShingle Population Data Science
Griffiths, R
Schlüter, DK
Akbari, A
Cosgriff, R
Tucker, D
Taylor-Robinson, D
Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title_full Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title_fullStr Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title_full_unstemmed Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title_short Identifying children with Cystic Fibrosis in population-scale routinely collected data in Wales: A Retrospective Review
title_sort identifying children with cystic fibrosis in population-scale routinely collected data in wales: a retrospective review
topic Population Data Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7898022/
https://www.ncbi.nlm.nih.gov/pubmed/33644411
http://dx.doi.org/10.23889/ijpds.v5i1.1346
work_keys_str_mv AT griffithsr identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview
AT schluterdk identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview
AT akbaria identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview
AT cosgriffr identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview
AT tuckerd identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview
AT taylorrobinsond identifyingchildrenwithcysticfibrosisinpopulationscaleroutinelycollecteddatainwalesaretrospectivereview