Measuring follow-up time in routinely-collected health datasets: Challenges and solutions

A key requirement for longitudinal studies using routinely-collected health data is to be able to measure what individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, includin...

Bibliographic Details
Main authors: Thayer, Daniel; Rees, Arfon; Kennedy, Jon; Collins, Huw; Harris, Dan; Halcox, Julian; Ruschetti, Luca; Noyce, Richard; Brooks, Caroline
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7012444/
https://www.ncbi.nlm.nih.gov/pubmed/32045428
http://dx.doi.org/10.1371/journal.pone.0228545
_version_ 1783496235291770880
author Thayer, Daniel
Rees, Arfon
Kennedy, Jon
Collins, Huw
Harris, Dan
Halcox, Julian
Ruschetti, Luca
Noyce, Richard
Brooks, Caroline
author_sort Thayer, Daniel
collection PubMed
description A key requirement for longitudinal studies using routinely-collected health data is to be able to measure which individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, including both life events and characteristics of the datasets themselves. An automated, customizable method of determining individuals’ presence was developed for the primary care dataset in Swansea University’s SAIL Databank. The primary care dataset covers only a portion of Wales, with 76% of practices participating. The start and end dates of the data vary by practice. Additionally, individuals can change practices or leave Wales. To address these issues, a two-step process was developed. First, the period for which each practice had data available was calculated by measuring changes in the rate of events recorded over time. Second, the registration records for each individual were simplified. Anomalies such as short gaps and overlaps were resolved by applying a set of rules. The result of these two analyses was a cleaned set of records indicating start and end dates of available primary care data for each individual. Analysis of GP records showed that 91.0% of events occurred within periods calculated by the algorithm as having available data. Of those events, 98.4% were observed at the same practice of registration as that computed by the algorithm. A standardized method for solving this common problem has enabled faster development of studies using this dataset. Using a rigorous, tested, standardized method of verifying presence in the study population will also positively influence the quality of research.
format Online
Article
Text
id pubmed-7012444
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-70124442020-02-21 Measuring follow-up time in routinely-collected health datasets: Challenges and solutions Thayer, Daniel Rees, Arfon Kennedy, Jon Collins, Huw Harris, Dan Halcox, Julian Ruschetti, Luca Noyce, Richard Brooks, Caroline PLoS One Research Article A key requirement for longitudinal studies using routinely-collected health data is to be able to measure what individuals are present in the datasets used, and over what time period. Individuals can enter and leave the covered population of administrative datasets for a variety of reasons, including both life events and characteristics of the datasets themselves. An automated, customizable method of determining individuals’ presence was developed for the primary care dataset in Swansea University’s SAIL Databank. The primary care dataset covers only a portion of Wales, with 76% of practices participating. The start and end date of the data varies by practice. Additionally, individuals can change practices or leave Wales. To address these issues, a two step process was developed. First, the period for which each practice had data available was calculated by measuring changes in the rate of events recorded over time. Second, the registration records for each individual were simplified. Anomalies such as short gaps and overlaps were resolved by applying a set of rules. The result of these two analyses was a cleaned set of records indicating start and end dates of available primary care data for each individual. Analysis of GP records showed that 91.0% of events occurred within periods calculated as having available data by the algorithm. 98.4% of those events were observed at the same practice of registration as that computed by the algorithm. A standardized method for solving this common problem has enabled faster development of studies using this data set. 
Using a rigorous, tested, standardized method of verifying presence in the study population will also positively influence the quality of research. Public Library of Science 2020-02-11 /pmc/articles/PMC7012444/ /pubmed/32045428 http://dx.doi.org/10.1371/journal.pone.0228545 Text en © 2020 Thayer et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Thayer, Daniel
Rees, Arfon
Kennedy, Jon
Collins, Huw
Harris, Dan
Halcox, Julian
Ruschetti, Luca
Noyce, Richard
Brooks, Caroline
Measuring follow-up time in routinely-collected health datasets: Challenges and solutions
title Measuring follow-up time in routinely-collected health datasets: Challenges and solutions
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7012444/
https://www.ncbi.nlm.nih.gov/pubmed/32045428
http://dx.doi.org/10.1371/journal.pone.0228545
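
The two-step process summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: this record does not state the actual rules, so the 30-day gap threshold (`max_gap_days`), the rule that a later registration at a different practice truncates the earlier one, the `Spell` type, and the hand-written `coverage` table of per-practice data availability are all assumptions made for illustration. The sketch also takes the practice coverage periods as given rather than deriving them from event rates.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Spell(NamedTuple):
    """One registration period for an individual (hypothetical structure)."""
    practice: str
    start: date
    end: date  # inclusive


def clean_registrations(spells, max_gap_days=30):
    """Resolve short gaps and overlaps in one individual's registration records.

    Assumed rules (not taken from the paper):
    - same-practice spells that overlap or are separated by at most
      max_gap_days unregistered days are merged into one spell;
    - when a spell at a different practice overlaps the previous spell,
      the previous spell is truncated to end the day before the new one starts.
    """
    cleaned = []
    for s in sorted(spells, key=lambda s: (s.start, s.end)):
        if s.end < s.start:
            continue  # skip malformed records
        if cleaned:
            last = cleaned[-1]
            gap_days = (s.start - last.end).days - 1  # negative if overlapping
            if s.practice == last.practice and gap_days <= max_gap_days:
                cleaned[-1] = last._replace(end=max(last.end, s.end))
                continue
            if s.start <= last.end:
                new_end = s.start - timedelta(days=1)
                if new_end < last.start:
                    cleaned.pop()  # previous spell is fully superseded
                else:
                    cleaned[-1] = last._replace(end=new_end)
        cleaned.append(s)
    return cleaned


def clip_to_coverage(spells, coverage):
    """Keep only the part of each spell in which the practice has usable data.

    coverage maps practice -> (first_date, last_date); practices that do not
    appear in it are treated as not contributing data.
    """
    clipped = []
    for s in spells:
        if s.practice not in coverage:
            continue
        cov_start, cov_end = coverage[s.practice]
        start, end = max(s.start, cov_start), min(s.end, cov_end)
        if start <= end:
            clipped.append(Spell(s.practice, start, end))
    return clipped


# Illustrative, made-up input for one individual.
raw = [
    Spell("A", date(2005, 1, 1), date(2008, 6, 30)),
    Spell("A", date(2008, 7, 10), date(2010, 12, 31)),  # 9-day gap
    Spell("B", date(2010, 12, 1), date(2015, 3, 31)),   # overlaps practice A
]
coverage = {
    "A": (date(2006, 1, 1), date(2020, 1, 1)),
    "B": (date(2000, 1, 1), date(2014, 12, 31)),
}

cleaned = clean_registrations(raw)
follow_up = clip_to_coverage(cleaned, coverage)
for spell in follow_up:
    print(spell.practice, spell.start, spell.end)
```

Keeping the registration cleaning separate from the coverage clipping mirrors the two analyses named in the description, so either rule set can be changed without touching the other.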